Skip to main content

Python Libraries for Excel Report Generation

This guide explains each library used in NetSuite Python report development and how they work together to generate Excel files.


Overview: How Libraries Work Together

┌─────────────────────────────────────────────────────────────────────────────┐
│ EXCEL REPORT GENERATION FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌─────────────┐ ┌──────────────────┐ │
│ │ NumPy │───►│ Pandas │───►│ XlsxWriter │───►│ Excel File │ │
│ │ │ │ │ │ │ │ (.xlsx) │ │
│ └──────────┘ └──────────┘ └─────────────┘ └──────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ Calculations Data Processing Formatting │
│ & Arrays & Transformation & Styling │
│ │
│ ┌──────────┐ ┌──────────┐ │
│ │ OpenPyXL │ │ Jupyter │ │
│ │ │ │ Notebook │ │
│ └──────────┘ └──────────┘ │
│ │ │ │
│ ▼ ▼ │
│ Reading Excel Development │
│ Files Environment │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

Jupyter Notebook

What is Jupyter?

Jupyter Notebook is an interactive development environment that runs in your browser. It allows you to write and execute Python code in "cells" and see results immediately.

Why Use Jupyter for Report Development?

FeatureBenefit
Cell-based executionRun code piece by piece, not all at once
Immediate feedbackSee results instantly after each cell
Data visualizationView DataFrames as formatted tables
Easy debuggingTest and fix code interactively
DocumentationMix code with markdown explanations

Jupyter Workflow Example

# Cell 1: Load your data
import pandas as pd
import json

with open('test_data.json', 'r') as f:
data = json.load(f)

print(f"Loaded {len(data)} records")
# Cell 2: Process data (run separately)
df = pd.DataFrame(data)
df.head() # See first 5 rows immediately
# Cell 3: Check calculations
df['total'] = df['amount1'] + df['amount2']
df[['amount1', 'amount2', 'total']] # Verify calculation
# Cell 4: Generate Excel (only run when ready)
df.to_excel('output.xlsx', index=False)
print("Excel file created!")

Key Jupyter Shortcuts

ShortcutAction
Shift+EnterRun cell and move to next
Ctrl+EnterRun cell and stay
BInsert cell below
AInsert cell above
DDDelete cell
MChange to Markdown
YChange to Code

Pandas

What is Pandas?

Pandas is the core library for data manipulation. Think of it as "Excel formulas and pivot tables in Python".

Core Concept: DataFrame

A DataFrame is like an Excel spreadsheet - it has rows and columns.

import pandas as pd

# Creating a DataFrame from NetSuite data (JSON format)
data = [
{"customer": "PT. ABC", "currency": "IDR", "amount": 5000000},
{"customer": "XYZ Corp", "currency": "USD", "amount": 1000},
{"customer": "PT. DEF", "currency": "IDR", "amount": 3000000}
]

df = pd.DataFrame(data)
print(df)

Output:

    customer currency   amount
0 PT. ABC IDR 5000000
1 XYZ Corp USD 1000
2 PT. DEF IDR 3000000

Common Pandas Operations for Reports

1. Filtering Data (Like Excel Filter)

# Filter rows where currency is IDR
idr_customers = df[df['currency'] == 'IDR']

# Filter with multiple conditions
large_idr = df[(df['currency'] == 'IDR') & (df['amount'] > 4000000)]

2. Adding Calculated Columns (Like Excel Formulas)

# Simple calculation
df['amount_with_tax'] = df['amount'] * 1.11

# Conditional calculation (like IF formula)
import numpy as np
df['category'] = np.where(df['amount'] > 1000000, 'Large', 'Small')

3. Grouping and Aggregation (Like Pivot Tables)

# Sum by currency
by_currency = df.groupby('currency')['amount'].sum()

# Multiple aggregations
summary = df.groupby('currency').agg({
'amount': ['sum', 'mean', 'count']
})

4. Sorting Data

# Sort by amount descending
df_sorted = df.sort_values('amount', ascending=False)

# Sort by multiple columns
df_sorted = df.sort_values(['currency', 'amount'], ascending=[True, False])

5. Handling Missing Data

# Check for missing values
print(df.isnull().sum())

# Fill missing values
df = df.fillna(0) # Replace NaN with 0
df = df.fillna({'amount': 0, 'customer': 'Unknown'})

Pandas to Excel Connection

# Basic export
df.to_excel('report.xlsx', index=False)

# Export with XlsxWriter engine for formatting
with pd.ExcelWriter('report.xlsx', engine='xlsxwriter') as writer:
df.to_excel(writer, sheet_name='Data', index=False)

# Access workbook for formatting
workbook = writer.book
worksheet = writer.sheets['Data']

NumPy

What is NumPy?

NumPy (Numerical Python) provides fast mathematical operations on arrays. Pandas is built on top of NumPy.

Why NumPy for Reports?

Use CaseNumPy Function
Simple IF-ELSEnp.where()
Multiple conditionsnp.select()
Mathematical operationsnp.sum(), np.mean()
Handling NaN valuesnp.nan, np.isnan()

Common NumPy Operations

1. Simple Conditional Logic (IF-ELSE)

import numpy as np
import pandas as pd

df = pd.DataFrame({
'amount': [500000, 2000000, 800000, 5000000]
})

# Like Excel: =IF(A1>1000000, "Large", "Small")
df['size'] = np.where(df['amount'] > 1000000, 'Large', 'Small')

Result:

    amount   size
0 500000 Small
1 2000000 Large
2 800000 Small
3 5000000 Large

2. Multiple Conditions (Nested IF)

# Like Excel: =IF(A1>5M, "Very Large", IF(A1>1M, "Large", IF(A1>500K, "Medium", "Small")))

conditions = [
df['amount'] > 5000000,
df['amount'] > 1000000,
df['amount'] > 500000
]
choices = ['Very Large', 'Large', 'Medium']

df['category'] = np.select(conditions, choices, default='Small')

3. Currency Conversion

# Convert foreign currency to IDR
conditions = [
df['currency'] == 'USD',
df['currency'] == 'SGD',
df['currency'] == 'EUR'
]
rates = [15500, 11500, 17000]

df['amount_idr'] = np.select(
conditions,
[df['amount'] * rate for rate in rates],
default=df['amount'] # Already IDR
)

4. Handling NaN in Calculations

# Replace NaN with 0 for calculations
df['amount'] = np.nan_to_num(df['amount'], nan=0)

# Check if value is NaN
df['is_missing'] = np.isnan(df['amount'])

XlsxWriter

What is XlsxWriter?

XlsxWriter creates Excel files with full formatting control - fonts, colors, borders, merged cells, conditional formatting, and more.

Why XlsxWriter?

FeatureDescription
Cell FormattingFonts, colors, borders, alignment
Number FormatsCurrency, percentages, dates
Merged CellsFor headers and titles
Conditional FormattingHighlight based on values
Column WidthsAuto or manual sizing
Multiple SheetsCreate workbooks with many sheets

XlsxWriter Workflow

import xlsxwriter
import io

# Create workbook in memory (for Pyodide/download)
buffer = io.BytesIO()
workbook = xlsxwriter.Workbook(buffer)
worksheet = workbook.add_worksheet('Report')

# Define formats
header_format = workbook.add_format({
'bold': True,
'font_size': 12,
'font_color': 'white',
'bg_color': '#2563eb',
'align': 'center',
'valign': 'vcenter',
'border': 1
})

currency_format = workbook.add_format({
'num_format': '#,##0',
'align': 'right',
'border': 1
})

# Write data with formatting
worksheet.write('A1', 'Customer', header_format)
worksheet.write('B1', 'Amount', header_format)
worksheet.write('A2', 'PT. ABC')
worksheet.write('B2', 5000000, currency_format)

# Set column widths
worksheet.set_column('A:A', 25)
worksheet.set_column('B:B', 15)

workbook.close()

Common Format Properties

# Font formatting
font_format = workbook.add_format({
'bold': True,
'italic': True,
'font_size': 14,
'font_name': 'Arial',
'font_color': '#FF0000'
})

# Alignment
align_format = workbook.add_format({
'align': 'center', # horizontal: left, center, right
'valign': 'vcenter', # vertical: top, vcenter, bottom
'text_wrap': True
})

# Borders
border_format = workbook.add_format({
'border': 1, # thin border all sides
'top': 2, # medium top border
'bottom': 1,
'left': 1,
'right': 1
})

# Number formats
number_formats = {
'integer': '#,##0',
'decimal': '#,##0.00',
'currency_idr': '"Rp" #,##0',
'currency_usd': '"$" #,##0.00',
'percentage': '0.00%',
'date': 'dd/mm/yyyy'
}

Merging Cells for Headers

title_format = workbook.add_format({
'bold': True,
'font_size': 16,
'align': 'center',
'valign': 'vcenter'
})

# Merge A1:E1 for title
worksheet.merge_range('A1:E1', 'PT. COMPANY NAME', title_format)
worksheet.merge_range('A2:E2', 'Monthly Sales Report', title_format)
worksheet.merge_range('A3:E3', 'December 2024', title_format)

Conditional Formatting

# Highlight values > 1,000,000 in green
worksheet.conditional_format('B2:B100', {
'type': 'cell',
'criteria': '>',
'value': 1000000,
'format': workbook.add_format({'bg_color': '#C6EFCE'})
})

# Highlight negative values in red
worksheet.conditional_format('B2:B100', {
'type': 'cell',
'criteria': '<',
'value': 0,
'format': workbook.add_format({
'bg_color': '#FFC7CE',
'font_color': '#9C0006'
})
})

OpenPyXL

What is OpenPyXL?

OpenPyXL is used for reading existing Excel files. While XlsxWriter creates new files, OpenPyXL can read and modify existing ones.

When to Use OpenPyXL

ScenarioUse
Create new Excel from scratchXlsxWriter
Read existing Excel fileOpenPyXL
Modify existing ExcelOpenPyXL
Template-based reportsOpenPyXL

Reading Excel Files

import pandas as pd

# Read Excel file into DataFrame
df = pd.read_excel('existing_file.xlsx', sheet_name='Sheet1')

# Read specific columns
df = pd.read_excel('file.xlsx', usecols=['A', 'B', 'C'])

# Read specific rows
df = pd.read_excel('file.xlsx', skiprows=5, nrows=100)

Reading with OpenPyXL Directly

from openpyxl import load_workbook

# Load existing workbook
wb = load_workbook('template.xlsx')
ws = wb.active

# Read cell value
value = ws['A1'].value

# Iterate through rows
for row in ws.iter_rows(min_row=2, max_row=100, min_col=1, max_col=5):
for cell in row:
print(cell.value)

Complete Example: All Libraries Together

# Cell 1: Imports
import pandas as pd
import numpy as np
import xlsxwriter
import io
import json

# Cell 2: Load Data (simulating NetSuite data)
data = [
{"customer": "PT. ABC", "currency": "IDR", "current": 5000000, "overdue_30": 2000000, "overdue_60": 0},
{"customer": "XYZ Corp", "currency": "USD", "current": 1000, "overdue_30": 500, "overdue_60": 200},
{"customer": "PT. DEF", "currency": "IDR", "current": 3000000, "overdue_30": 0, "overdue_60": 1500000},
]

df = pd.DataFrame(data)
print(f"Loaded {len(df)} records")
df.head()

# Cell 3: Process Data with Pandas & NumPy
# Fill any missing values
df = df.fillna(0)

# Calculate totals
df['total_overdue'] = df['overdue_30'] + df['overdue_60']
df['total'] = df['current'] + df['total_overdue']

# Add category using NumPy
df['risk'] = np.where(df['total_overdue'] > 1000000, 'High',
np.where(df['total_overdue'] > 0, 'Medium', 'Low'))

# Convert USD to IDR
df['total_idr'] = np.where(
df['currency'] == 'USD',
df['total'] * 15500,
df['total']
)

print("Processed data:")
df

# Cell 4: Generate Excel with XlsxWriter
buffer = io.BytesIO()

with pd.ExcelWriter(buffer, engine='xlsxwriter') as writer:
# Write data starting at row 5
df.to_excel(writer, sheet_name='AR Aging', startrow=5, index=False)

workbook = writer.book
worksheet = writer.sheets['AR Aging']

# Define formats
title_fmt = workbook.add_format({
'bold': True, 'font_size': 16, 'align': 'center'
})
header_fmt = workbook.add_format({
'bold': True, 'bg_color': '#2563eb', 'font_color': 'white',
'align': 'center', 'border': 1
})
currency_fmt = workbook.add_format({
'num_format': '#,##0', 'border': 1
})
high_risk_fmt = workbook.add_format({
'bg_color': '#FFC7CE', 'border': 1
})

# Write title (merged cells)
worksheet.merge_range('A1:H1', 'PT. COMPANY NAME', title_fmt)
worksheet.merge_range('A2:H2', 'AR Aging Report', title_fmt)
worksheet.merge_range('A3:H3', 'As of December 2024', title_fmt)

# Format header row
for col_num, column in enumerate(df.columns):
worksheet.write(5, col_num, column, header_fmt)

# Set column widths
worksheet.set_column('A:A', 20) # Customer
worksheet.set_column('B:B', 10) # Currency
worksheet.set_column('C:H', 15) # Numbers

# Conditional formatting for high risk
worksheet.conditional_format('G7:G100', {
'type': 'text',
'criteria': 'containing',
'value': 'High',
'format': high_risk_fmt
})

# Save to file
with open('ar_aging_report.xlsx', 'wb') as f:
buffer.seek(0)
f.write(buffer.read())

print("Excel report generated: ar_aging_report.xlsx")

Summary: When to Use Each Library

LibraryPrimary PurposeKey Functions
JupyterDevelopment environmentInteractive coding, debugging
PandasData manipulationDataFrame, filtering, grouping, aggregation
NumPyCalculations & logicnp.where(), np.select(), math operations
XlsxWriterExcel creation & formattingFormats, merged cells, conditional formatting
OpenPyXLReading Excel filesLoad templates, read existing data

Quick Reference

# Pandas: Data processing
df = pd.DataFrame(data)
df = df.fillna(0)
df_filtered = df[df['amount'] > 1000]
df_grouped = df.groupby('category').sum()

# NumPy: Calculations
df['status'] = np.where(condition, 'Yes', 'No')
df['category'] = np.select(conditions, choices, default='Other')

# XlsxWriter: Formatting
format = workbook.add_format({'bold': True, 'bg_color': '#FFFF00'})
worksheet.write('A1', 'Header', format)
worksheet.merge_range('A1:C1', 'Title', format)
worksheet.set_column('A:A', 20)