Generating PDF reports

The Python ecosystem has everything needed to create PDF reports with a custom layout. This guide shows how to generate a PDF report extracting data from an Atoti session and displaying it as a table spanning over multiple pages and a chart.

Creating the session

Let’s start by creating an Atoti session and loading some data into it:

>>> import atoti as tt
>>> session = tt.Session.start()  

>>> sales_table = session.read_csv(
...     data_directory / "sales.csv",
...     keys={"Sale ID"},
...     table_name="Sales",
... )
>>> products_table = session.read_csv(
...     data_directory / "products.csv",
...     keys={"Product"},
...     table_name="Products",
... )
>>> sales_table.join(
...     products_table, sales_table["Product"] == products_table["Product"]
... )
>>> cube = session.create_cube(sales_table)
>>> l, m = cube.levels, cube.measures

Querying the session

The first query will retrieve the values of a few measures for each Sub category and Brand.

Styling tables

The result of this first query will be displayed as a table in our report. We’ll use three different techniques to style it:

Defining cell properties in the MDX query

In the query below, we use calculated measures to style the table cells. In particular, FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL) in the following query ensures that the cells for the contributors.COUNT measure are colored red if their value is less than 300.

>>> mdx = '''
...     WITH
...     MEMBER [Measures].[Count] as [Measures].[contributors.COUNT],
...     FORE_COLOR=IIf([Measures].[Count] < 300, RGB(255, 135, 135), NULL)
...     MEMBER [Measures].[Quantity] as [Measures].[Quantity.SUM],
...     FORE_COLOR=IIf([Measures].[Quantity.SUM] > 700, RGB(50, 150, 50), NULL)
...     SELECT
...       NON EMPTY Crossjoin(
...         Hierarchize(
...           Descendants(
...             {
...               [Products].[Sub category].[AllMember]
...             },
...             1,
...             SELF_AND_BEFORE
...           )
...         ),
...         Hierarchize(
...           Descendants(
...             {
...               [Products].[Brand].[AllMember]
...             },
...             1,
...             SELF_AND_BEFORE
...           )
...         )
...       ) ON ROWS,
...       NON EMPTY {
...         [Measures].[Count],
...         [Measures].[Quantity],
...         [Measures].[Unit price.MEAN],
...         [Measures].[Unit price.SUM]
...       } ON COLUMNS
...     FROM [Sales]
...     CELL PROPERTIES
...       FORE_COLOR,
...       FORMATTED_VALUE,
...       VALUE
... '''
>>> df = session.query_mdx(mdx, keep_totals=True)
>>> df.style  

Applying style with pandas

pandas provides many ways to style a DataFrame. Here, we’ll apply a background color to a specified row:

>>> index_slice = pd.IndexSlice
>>> subset = index_slice[index_slice["Chair", "Mega"], :]
>>> df_style = df.style.set_properties(
...     **{"background-color": "#ffff92"},
...     subset=subset,
... )
>>> df_style  

Adding style with CSS

Before being a PDF, our report will be an HTML file so we can use CSS to style it:

>>> table_css

table {
  border-collapse: collapse;
}

td {
  border: 1px solid black;
  padding: 8px;
  text-align: right;
}

th {
  border: 1px solid black;
  padding: 8px;
  text-align: center;
}

thead {
  background-color: lightgray;
}

Creating the chart

Many Python libraries can create charts from pandas DataFrames. In this how-to, we’ll use seaborn which is a wrapper around Matplotlib:

>>> import io
>>> import matplotlib_inline.backend_inline
>>> # Render charts as vector graphics so they stay sharp at any resolution.
>>> matplotlib_inline.backend_inline.set_matplotlib_formats("svg")
>>> import matplotlib.pyplot as plt
>>> import seaborn as sns

>>> data = cube.query(
...     m["Quantity.SUM"], levels=[l["Sub category"]]
... ).reset_index()
>>> _ = sns.barplot(
...     data=data,
...     x="Sub category",
...     y="Quantity.SUM",
... ).set(title="Quantity.SUM by Sub category")
>>> chart = io.BytesIO()
>>> plt.savefig(chart, format="svg", metadata={"Date": None})

Creating the HTML report

We’ll use Jinja to render this template:

>>> template_path = resources_directory / "template.jinja"
>>> template_code = template_path.read_text()
>>> template_code  

<html>
  <head>
    <style>
      {{ table_css }}
    </style>
  </head>
  <body>
    <header
      style="
        display: flex;
        align-items: center;
        justify-content: space-between;
        width: 100%;
      "
    >
      <h1>{{ report_name }}</h1>
      <h5>Report Date: {{ date }}</h5>
    </header>
    <div>
      <h3>{{ table_title }}</h3>
      <div>{{ table }}</div>
    </div>
    <br />
    <div>{{ chart }}</div>
  </body>
</html>

With this data:

>>> from datetime import date
>>> template_data = {
...     "chart": chart.getvalue().decode(),
...     "date": date(2024, 8, 10),
...     "report_name": "Atoti PDF Report",
...     "table": df_style.to_html(),
...     "table_css": table_css,
...     "table_title": "Measures by Sub category and Brand",
... }

>>> from jinja2 import Environment, FileSystemLoader
>>> environment = Environment(
...     autoescape=False,
...     loader=FileSystemLoader(template_path.parent),
... )
>>> template = environment.get_template(template_path.name)
>>> html = template.render(template_data)

Exporting the report as a PDF

WeasyPrint can render HTML as a PDF:

>>> import weasyprint
>>> pdf_path = output_directory / "report.pdf"
>>> document = weasyprint.HTML(string=html)
>>> document.write_pdf(pdf_path)

Going further

The report could be sent by email periodically using the schedule library or:

Task Scheduler on Windows
cron on Linux or macOS
A Lambda function in the cloud

​Creating the session

​Querying the session

​Styling tables

​Defining cell properties in the MDX query

​Applying style with pandas

​Adding style with CSS

​Creating the chart

​Creating the HTML report

​Exporting the report as a PDF

​Going further