Python Libraries for Excel Report Generation

This guide explains each library used in NetSuite Python report development and how they work together to generate Excel files.

Overview: How Libraries Work Together

┌─────────────────────────────────────────────────────────────────────────────┐
│                        EXCEL REPORT GENERATION FLOW                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────┐    ┌──────────┐    ┌─────────────┐    ┌──────────────────┐   │
│  │  NumPy   │───►│  Pandas  │───►│ XlsxWriter  │───►│  Excel File      │   │
│  │          │    │          │    │             │    │  (.xlsx)         │   │
│  └──────────┘    └──────────┘    └─────────────┘    └──────────────────┘   │
│       │               │                │                                    │
│       ▼               ▼                ▼                                    │
│  Calculations    Data Processing   Formatting                              │
│  & Arrays        & Transformation  & Styling                               │
│                                                                             │
│  ┌──────────┐    ┌──────────┐                                              │
│  │ OpenPyXL │    │ Jupyter  │                                              │
│  │          │    │ Notebook │                                              │
│  └──────────┘    └──────────┘                                              │
│       │               │                                                     │
│       ▼               ▼                                                     │
│  Reading Excel   Development                                               │
│  Files           Environment                                               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Jupyter Notebook

What is Jupyter?

Jupyter Notebook is an interactive development environment that runs in your browser. It allows you to write and execute Python code in "cells" and see results immediately.

Why Use Jupyter for Report Development?

Feature	Benefit
Cell-based execution	Run code piece by piece, not all at once
Immediate feedback	See results instantly after each cell
Data visualization	View DataFrames as formatted tables
Easy debugging	Test and fix code interactively
Documentation	Mix code with markdown explanations

Jupyter Workflow Example

# Cell 1: Load your data
import pandas as pd
import json

with open('test_data.json', 'r') as f:
    data = json.load(f)

print(f"Loaded {len(data)} records")

# Cell 2: Process data (run separately)
df = pd.DataFrame(data)
df.head()  # See first 5 rows immediately

# Cell 3: Check calculations
df['total'] = df['amount1'] + df['amount2']
df[['amount1', 'amount2', 'total']]  # Verify calculation

# Cell 4: Generate Excel (only run when ready)
df.to_excel('output.xlsx', index=False)
print("Excel file created!")

Key Jupyter Shortcuts

Shortcut	Action
`Shift+Enter`	Run cell and move to next
`Ctrl+Enter`	Run cell and stay
`B`	Insert cell below
`A`	Insert cell above
`DD`	Delete cell
`M`	Change to Markdown
`Y`	Change to Code

Pandas

What is Pandas?

Pandas is the core library for data manipulation. Think of it as "Excel formulas and pivot tables in Python".

Core Concept: DataFrame

A DataFrame is like an Excel spreadsheet - it has rows and columns.

import pandas as pd

# Creating a DataFrame from NetSuite data (JSON format)
data = [
    {"customer": "PT. ABC", "currency": "IDR", "amount": 5000000},
    {"customer": "XYZ Corp", "currency": "USD", "amount": 1000},
    {"customer": "PT. DEF", "currency": "IDR", "amount": 3000000}
]

df = pd.DataFrame(data)
print(df)

Output:

    customer currency   amount
  PT. ABC      IDR  5000000
 XYZ Corp      USD     1000
  PT. DEF      IDR  3000000

Common Pandas Operations for Reports

1. Filtering Data (Like Excel Filter)

# Filter rows where currency is IDR
idr_customers = df[df['currency'] == 'IDR']

# Filter with multiple conditions
large_idr = df[(df['currency'] == 'IDR') & (df['amount'] > 4000000)]

2. Adding Calculated Columns (Like Excel Formulas)

# Simple calculation
df['amount_with_tax'] = df['amount'] * 1.11

# Conditional calculation (like IF formula)
import numpy as np
df['category'] = np.where(df['amount'] > 1000000, 'Large', 'Small')

3. Grouping and Aggregation (Like Pivot Tables)

# Sum by currency
by_currency = df.groupby('currency')['amount'].sum()

# Multiple aggregations
summary = df.groupby('currency').agg({
    'amount': ['sum', 'mean', 'count']
})

4. Sorting Data

# Sort by amount descending
df_sorted = df.sort_values('amount', ascending=False)

# Sort by multiple columns
df_sorted = df.sort_values(['currency', 'amount'], ascending=[True, False])

5. Handling Missing Data

# Check for missing values
print(df.isnull().sum())

# Fill missing values
df = df.fillna(0)  # Replace NaN with 0
df = df.fillna({'amount': 0, 'customer': 'Unknown'})

Pandas to Excel Connection

# Basic export
df.to_excel('report.xlsx', index=False)

# Export with XlsxWriter engine for formatting
with pd.ExcelWriter('report.xlsx', engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Data', index=False)

    # Access workbook for formatting
    workbook = writer.book
    worksheet = writer.sheets['Data']

NumPy

What is NumPy?

NumPy (Numerical Python) provides fast mathematical operations on arrays. Pandas is built on top of NumPy.

Why NumPy for Reports?

Use Case	NumPy Function
Simple IF-ELSE	`np.where()`
Multiple conditions	`np.select()`
Mathematical operations	`np.sum()`, `np.mean()`
Handling NaN values	`np.nan`, `np.isnan()`

Common NumPy Operations

1. Simple Conditional Logic (IF-ELSE)

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'amount': [500000, 2000000, 800000, 5000000]
})

# Like Excel: =IF(A1>1000000, "Large", "Small")
df['size'] = np.where(df['amount'] > 1000000, 'Large', 'Small')

Result:

    amount   size
 500000  Small
2000000  Large
 800000  Small
5000000  Large

2. Multiple Conditions (Nested IF)

# Like Excel: =IF(A1>5M, "Very Large", IF(A1>1M, "Large", IF(A1>500K, "Medium", "Small")))

conditions = [
    df['amount'] > 5000000,
    df['amount'] > 1000000,
    df['amount'] > 500000
]
choices = ['Very Large', 'Large', 'Medium']

df['category'] = np.select(conditions, choices, default='Small')

3. Currency Conversion

# Convert foreign currency to IDR
conditions = [
    df['currency'] == 'USD',
    df['currency'] == 'SGD',
    df['currency'] == 'EUR'
]
rates = [15500, 11500, 17000]

df['amount_idr'] = np.select(
    conditions,
    [df['amount'] * rate for rate in rates],
    default=df['amount']  # Already IDR
)

4. Handling NaN in Calculations

# Replace NaN with 0 for calculations
df['amount'] = np.nan_to_num(df['amount'], nan=0)

# Check if value is NaN
df['is_missing'] = np.isnan(df['amount'])

XlsxWriter

What is XlsxWriter?

XlsxWriter creates Excel files with full formatting control - fonts, colors, borders, merged cells, conditional formatting, and more.

Why XlsxWriter?

Feature	Description
Cell Formatting	Fonts, colors, borders, alignment
Number Formats	Currency, percentages, dates
Merged Cells	For headers and titles
Conditional Formatting	Highlight based on values
Column Widths	Auto or manual sizing
Multiple Sheets	Create workbooks with many sheets

XlsxWriter Workflow

import xlsxwriter
import io

# Create workbook in memory (for Pyodide/download)
buffer = io.BytesIO()
workbook = xlsxwriter.Workbook(buffer)
worksheet = workbook.add_worksheet('Report')

# Define formats
header_format = workbook.add_format({
    'bold': True,
    'font_size': 12,
    'font_color': 'white',
    'bg_color': '#2563eb',
    'align': 'center',
    'valign': 'vcenter',
    'border': 1
})

currency_format = workbook.add_format({
    'num_format': '#,##0',
    'align': 'right',
    'border': 1
})

# Write data with formatting
worksheet.write('A1', 'Customer', header_format)
worksheet.write('B1', 'Amount', header_format)
worksheet.write('A2', 'PT. ABC')
worksheet.write('B2', 5000000, currency_format)

# Set column widths
worksheet.set_column('A:A', 25)
worksheet.set_column('B:B', 15)

workbook.close()

Common Format Properties

# Font formatting
font_format = workbook.add_format({
    'bold': True,
    'italic': True,
    'font_size': 14,
    'font_name': 'Arial',
    'font_color': '#FF0000'
})

# Alignment
align_format = workbook.add_format({
    'align': 'center',      # horizontal: left, center, right
    'valign': 'vcenter',    # vertical: top, vcenter, bottom
    'text_wrap': True
})

# Borders
border_format = workbook.add_format({
    'border': 1,            # thin border all sides
    'top': 2,               # medium top border
    'bottom': 1,
    'left': 1,
    'right': 1
})

# Number formats
number_formats = {
    'integer': '#,##0',
    'decimal': '#,##0.00',
    'currency_idr': '"Rp" #,##0',
    'currency_usd': '"$" #,##0.00',
    'percentage': '0.00%',
    'date': 'dd/mm/yyyy'
}

Merging Cells for Headers

title_format = workbook.add_format({
    'bold': True,
    'font_size': 16,
    'align': 'center',
    'valign': 'vcenter'
})

# Merge A1:E1 for title
worksheet.merge_range('A1:E1', 'PT. COMPANY NAME', title_format)
worksheet.merge_range('A2:E2', 'Monthly Sales Report', title_format)
worksheet.merge_range('A3:E3', 'December 2024', title_format)

Conditional Formatting

# Highlight values > 1,000,000 in green
worksheet.conditional_format('B2:B100', {
    'type': 'cell',
    'criteria': '>',
    'value': 1000000,
    'format': workbook.add_format({'bg_color': '#C6EFCE'})
})

# Highlight negative values in red
worksheet.conditional_format('B2:B100', {
    'type': 'cell',
    'criteria': '<',
    'value': 0,
    'format': workbook.add_format({
        'bg_color': '#FFC7CE',
        'font_color': '#9C0006'
    })
})

OpenPyXL

What is OpenPyXL?

OpenPyXL is used for reading existing Excel files. While XlsxWriter creates new files, OpenPyXL can read and modify existing ones.

When to Use OpenPyXL

Scenario	Use
Create new Excel from scratch	XlsxWriter
Read existing Excel file	OpenPyXL
Modify existing Excel	OpenPyXL
Template-based reports	OpenPyXL

Reading Excel Files

import pandas as pd

# Read Excel file into DataFrame
df = pd.read_excel('existing_file.xlsx', sheet_name='Sheet1')

# Read specific columns
df = pd.read_excel('file.xlsx', usecols=['A', 'B', 'C'])

# Read specific rows
df = pd.read_excel('file.xlsx', skiprows=5, nrows=100)

Reading with OpenPyXL Directly

from openpyxl import load_workbook

# Load existing workbook
wb = load_workbook('template.xlsx')
ws = wb.active

# Read cell value
value = ws['A1'].value

# Iterate through rows
for row in ws.iter_rows(min_row=2, max_row=100, min_col=1, max_col=5):
    for cell in row:
        print(cell.value)

Complete Example: All Libraries Together

# Cell 1: Imports
import pandas as pd
import numpy as np
import xlsxwriter
import io
import json

# Cell 2: Load Data (simulating NetSuite data)
data = [
    {"customer": "PT. ABC", "currency": "IDR", "current": 5000000, "overdue_30": 2000000, "overdue_60": 0},
    {"customer": "XYZ Corp", "currency": "USD", "current": 1000, "overdue_30": 500, "overdue_60": 200},
    {"customer": "PT. DEF", "currency": "IDR", "current": 3000000, "overdue_30": 0, "overdue_60": 1500000},
]

df = pd.DataFrame(data)
print(f"Loaded {len(df)} records")
df.head()

# Cell 3: Process Data with Pandas & NumPy
# Fill any missing values
df = df.fillna(0)

# Calculate totals
df['total_overdue'] = df['overdue_30'] + df['overdue_60']
df['total'] = df['current'] + df['total_overdue']

# Add category using NumPy
df['risk'] = np.where(df['total_overdue'] > 1000000, 'High',
             np.where(df['total_overdue'] > 0, 'Medium', 'Low'))

# Convert USD to IDR
df['total_idr'] = np.where(
    df['currency'] == 'USD',
    df['total'] * 15500,
    df['total']
)

print("Processed data:")
df

# Cell 4: Generate Excel with XlsxWriter
buffer = io.BytesIO()

with pd.ExcelWriter(buffer, engine='xlsxwriter') as writer:
    # Write data starting at row 5
    df.to_excel(writer, sheet_name='AR Aging', startrow=5, index=False)

    workbook = writer.book
    worksheet = writer.sheets['AR Aging']

    # Define formats
    title_fmt = workbook.add_format({
        'bold': True, 'font_size': 16, 'align': 'center'
    })
    header_fmt = workbook.add_format({
        'bold': True, 'bg_color': '#2563eb', 'font_color': 'white',
        'align': 'center', 'border': 1
    })
    currency_fmt = workbook.add_format({
        'num_format': '#,##0', 'border': 1
    })
    high_risk_fmt = workbook.add_format({
        'bg_color': '#FFC7CE', 'border': 1
    })

    # Write title (merged cells)
    worksheet.merge_range('A1:H1', 'PT. COMPANY NAME', title_fmt)
    worksheet.merge_range('A2:H2', 'AR Aging Report', title_fmt)
    worksheet.merge_range('A3:H3', 'As of December 2024', title_fmt)

    # Format header row
    for col_num, column in enumerate(df.columns):
        worksheet.write(5, col_num, column, header_fmt)

    # Set column widths
    worksheet.set_column('A:A', 20)  # Customer
    worksheet.set_column('B:B', 10)  # Currency
    worksheet.set_column('C:H', 15)  # Numbers

    # Conditional formatting for high risk
    worksheet.conditional_format('G7:G100', {
        'type': 'text',
        'criteria': 'containing',
        'value': 'High',
        'format': high_risk_fmt
    })

# Save to file
with open('ar_aging_report.xlsx', 'wb') as f:
    buffer.seek(0)
    f.write(buffer.read())

print("Excel report generated: ar_aging_report.xlsx")

Summary: When to Use Each Library

Library	Primary Purpose	Key Functions
Jupyter	Development environment	Interactive coding, debugging
Pandas	Data manipulation	DataFrame, filtering, grouping, aggregation
NumPy	Calculations & logic	`np.where()`, `np.select()`, math operations
XlsxWriter	Excel creation & formatting	Formats, merged cells, conditional formatting
OpenPyXL	Reading Excel files	Load templates, read existing data

Quick Reference

# Pandas: Data processing
df = pd.DataFrame(data)
df = df.fillna(0)
df_filtered = df[df['amount'] > 1000]
df_grouped = df.groupby('category').sum()

# NumPy: Calculations
df['status'] = np.where(condition, 'Yes', 'No')
df['category'] = np.select(conditions, choices, default='Other')

# XlsxWriter: Formatting
format = workbook.add_format({'bold': True, 'bg_color': '#FFFF00'})
worksheet.write('A1', 'Header', format)
worksheet.merge_range('A1:C1', 'Title', format)
worksheet.set_column('A:A', 20)

Overview: How Libraries Work Together​

Jupyter Notebook​

What is Jupyter?​

Why Use Jupyter for Report Development?​

Jupyter Workflow Example​

Key Jupyter Shortcuts​

Pandas​

What is Pandas?​

Core Concept: DataFrame​

Common Pandas Operations for Reports​

1. Filtering Data (Like Excel Filter)​

2. Adding Calculated Columns (Like Excel Formulas)​

3. Grouping and Aggregation (Like Pivot Tables)​

4. Sorting Data​

5. Handling Missing Data​

Pandas to Excel Connection​

NumPy​

What is NumPy?​

Why NumPy for Reports?​

Common NumPy Operations​

1. Simple Conditional Logic (IF-ELSE)​

2. Multiple Conditions (Nested IF)​

3. Currency Conversion​

4. Handling NaN in Calculations​

XlsxWriter​

What is XlsxWriter?​

Why XlsxWriter?​

XlsxWriter Workflow​

Common Format Properties​

Merging Cells for Headers​

Conditional Formatting​

OpenPyXL​

What is OpenPyXL?​

When to Use OpenPyXL​

Reading Excel Files​

Reading with OpenPyXL Directly​

Complete Example: All Libraries Together​

Summary: When to Use Each Library​

Quick Reference​