# Reports & Outputs

Understanding Aparture's output files and report structure.

## Output Files

Aparture generates three types of files, all saved to the `reports/` directory:
### Analysis Report

**Filename**: `YYYY-MM-DD_arxiv_analysis_XXmin.md`

The primary output containing complete analysis results.

- **Example**: `2025-10-14_arxiv_analysis_45min.md`
- **Format**: Markdown (human-readable, GitHub-compatible)
- **Size**: 50-500 KB depending on paper count and PDF analyses
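The filename pattern above can be reproduced in the shell, which is handy when scripting against a run's report. This is a sketch based only on the documented pattern; the `MINUTES` value is whatever analysis duration you configured:

```shell
# Build today's expected report path from the documented pattern
# YYYY-MM-DD_arxiv_analysis_XXmin.md.
TODAY=$(date +%F)   # date +%F prints YYYY-MM-DD
MINUTES=45          # your configured analysis duration
REPORT="reports/${TODAY}_arxiv_analysis_${MINUTES}min.md"
echo "$REPORT"
```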
### NotebookLM Document

**Filename**: `YYYY-MM-DD_notebooklm_XXmin.md`

A podcast-optimized document structured for audio generation.

- **Example**: `2025-10-14_notebooklm_15min.md`
- **Format**: Markdown with special structuring
- **Size**: 30-150 KB
- **Purpose**: Upload to notebooklm.google.com for podcast creation
### Audio Podcast

**Filename**: `YYYY-MM-DD_podcast.m4a`

An AI-generated audio overview (requires CLI automation).

- **Example**: `2025-10-14_podcast.m4a`
- **Format**: M4A (audio)
- **Size**: 5-30 MB depending on duration
- **Duration**: 5-30 minutes (configurable)
> **Podcasts require CLI automation**
>
> The web interface can generate the NotebookLM document, but you'll need to upload it manually to Google NotebookLM to create the podcast. Use CLI automation for fully automated podcast generation.
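Because all three outputs share the run-date prefix, a small helper can collect everything a given run produced. This is a sketch assuming the default `reports/` layout and the filename patterns above; `aparture_outputs` is a hypothetical helper name, not part of Aparture itself:

```shell
# List every Aparture output file for one run date, relying on the
# shared YYYY-MM-DD prefix of all three file types.
aparture_outputs() {
  # $1 = run date, e.g. 2025-10-14
  for f in reports/"$1"_*; do
    [ -e "$f" ] || continue   # skip the literal glob when nothing matches
    echo "$f"
  done
}

aparture_outputs "2025-10-14"
```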
## Analysis Report Structure

The analysis report follows a consistent structure:
### Header Section

```markdown
# arXiv Analysis Report

**Date**: 2025-10-14
**Analysis Duration**: 45 minutes
**Generated by**: Aparture v1.0

## Configuration

- **Categories**: cs.LG, cs.AI, stat.ML (47 papers)
- **Quick Filter**: Enabled (Haiku 4.5)
- **Abstract Scoring**: Enabled (Sonnet 4.5)
- **PDF Analysis**: Top 20 papers (Opus 4.1)
```

Key information:

- Run date and duration
- Categories analyzed
- Paper counts
- Models used
### Executive Summary

```markdown
## Executive Summary

Analyzed 47 recent papers from arXiv. Key findings:

**Top Themes:**
- 12 papers on transformer architectures
- 8 papers on Bayesian methods
- 6 papers on interpretability

**Highlights:**
- Novel attention mechanism (Score: 9.2)
- Scalable inference method (Score: 8.8)
- New benchmark dataset (Score: 8.5)

**Recommendations:**
Priority reading: Papers #1, #3, #5 for immediate relevance.
```

Provides a quick overview of:

- Paper distribution
- Major themes
- Top highlights
- Reading recommendations
### Stage Results

Each processing stage gets a dedicated section:

#### Quick Filter Results

```markdown
## Stage 1: Quick Filter

**Model**: Claude Haiku 4.5
**Duration**: 2 minutes 34 seconds
**Cost**: $0.08

**Results:**
- ✓ YES: 18 papers (38%)
- ~ MAYBE: 12 papers (26%)
- ✗ NO: 17 papers (36%)

**Filtered out**: 17 papers
**Proceeding to scoring**: 30 papers
```

Shows filtering effectiveness and cost.
#### Abstract Scoring Results

```markdown
## Stage 2: Abstract Scoring

**Model**: Claude Sonnet 4.5
**Duration**: 15 minutes 22 seconds
**Cost**: $1.45

**Score Distribution:**
- 9-10: 3 papers (10%)
- 7-8: 8 papers (27%)
- 5-6: 12 papers (40%)
- 3-4: 7 papers (23%)

**Average Score**: 6.2 / 10
```

Provides scoring overview and statistics.
#### PDF Analysis Results

```markdown
## Stage 3: PDF Analysis

**Model**: Claude Opus 4.1
**Duration**: 28 minutes 15 seconds
**Cost**: $3.20

**Papers Analyzed**: 20 / 30
**Success Rate**: 100%
**Average PDF Size**: 2.3 MB
```

Shows deep analysis statistics.
### Paper Details

Each paper gets a detailed entry:

```markdown
### 1. Novel Attention Mechanism for Transformers

**Score**: 9.2 / 10
**Authors**: Smith et al.
**arXiv**: [2410.12345](https://arxiv.org/abs/2410.12345)
**PDF**: [Download](https://arxiv.org/pdf/2410.12345.pdf)

**Abstract:**
We propose a new attention mechanism that reduces computational
complexity from O(n²) to O(n log n) while maintaining performance...

**Relevance Justification:**
Highly relevant. Addresses key challenge in transformer scaling
with novel approach. Strong empirical results. Builds on recent
work in efficient attention.

**PDF Analysis:**
The paper introduces "Sparse Hierarchical Attention" (SHA) which...

**Key Contributions:**
- Reduces attention complexity to O(n log n)
- Maintains accuracy on standard benchmarks
- Provides theoretical analysis of approximation quality

**Methodology:**
- Hierarchical clustering of tokens
- Sparse attention patterns
- Gradient-based importance sampling

**Results:**
- 3x faster training on long sequences
- Comparable accuracy to full attention
- Scales to 100K token sequences

**Limitations:**
- Requires tuning of sparsity hyperparameters
- Limited evaluation on generation tasks

**Future Directions:**
- Extension to decoder-only models
- Application to multi-modal learning
```

Includes:

- Title and metadata
- Abstract and relevance score
- PDF analysis (if performed)
- Key contributions
- Methodology summary
- Results and limitations
### Footer Section

```markdown
## Analysis Summary

**Total Papers**: 47
**Papers Scored**: 30
**Papers with PDF Analysis**: 20
**Total Duration**: 45 minutes 11 seconds
**Total Cost**: $4.73

**Model Breakdown:**
- Quick Filter (Haiku 4.5): $0.08
- Abstract Scoring (Sonnet 4.5): $1.45
- PDF Analysis (Opus 4.1): $3.20

---
_Generated by Aparture - AI-powered research paper discovery_
```

Provides complete cost and timing breakdown.
## NotebookLM Document Structure

The NotebookLM document uses a different structure optimized for audio:

### Conversational Format

```markdown
# Research Highlights: Computer Science (cs.LG, cs.AI)

## Overview

Today's analysis covered 47 papers in machine learning and AI,
with several exciting developments in transformer architectures
and Bayesian inference methods.

## Major Themes

### Transformer Efficiency

There's significant progress in making transformers more efficient.
Smith et al. introduce "Sparse Hierarchical Attention"...

### Bayesian Methods

Several papers explore Bayesian approaches to deep learning.
The most interesting is Jones et al.'s work on...
```

Key differences:

- **Narrative style**: Flows like a conversation
- **Thematic organization**: Groups related papers
- **Synthesis**: Connects ideas across papers
- **Audio-friendly**: Short sentences, clear structure
### Structured Sections

```markdown
## Deep Dive: Sparse Hierarchical Attention

This paper by Smith et al. tackles a fundamental challenge:
transformers are computationally expensive on long sequences.

**The Problem**: Standard attention is O(n²), making it
prohibitive for documents longer than a few thousand tokens.

**The Solution**: Hierarchical clustering creates sparse
attention patterns that approximate full attention.

**The Impact**: 3x faster training with minimal accuracy loss.

**Why It Matters**: Opens the door to processing much longer
documents and could enable new applications in...
```

Provides deep dives on top papers with context and implications.
## Working with Reports

### Opening Reports

Markdown viewers:

- **VS Code**: Built-in preview (Ctrl/Cmd + Shift + V)
- **Obsidian**: Rich markdown experience
- **Typora**: WYSIWYG markdown editor
- **GitHub**: Upload for web viewing

Convert to other formats:

- **PDF**: Use `pandoc` or markdown-to-pdf tools
- **HTML**: Use `marked` or static site generators
- **Word**: Use `pandoc` with DOCX output
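As one concrete example, the `pandoc` conversions can be scripted. This sketch is guarded so it is a no-op when `pandoc` or the report is missing; PDF output additionally needs a LaTeX engine, so it is left as a comment:

```shell
# Convert an analysis report to HTML and Word with pandoc.
REPORT=reports/2025-10-14_arxiv_analysis_45min.md
if command -v pandoc >/dev/null 2>&1 && [ -f "$REPORT" ]; then
  pandoc "$REPORT" --standalone -o report.html   # HTML page
  pandoc "$REPORT" -o report.docx                # Word document
  # pandoc "$REPORT" -o report.pdf               # PDF (requires LaTeX)
fi
```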
### Searching Reports

Find papers by topic:

```shell
grep -i "bayesian" reports/2025-10-14_arxiv_analysis_45min.md
```

Find high-scoring papers:

```shell
grep "Score: [89]" reports/2025-10-14_arxiv_analysis_45min.md
```

Extract arXiv IDs:

```shell
grep -o "arxiv.org/abs/[0-9.]*" reports/*.md
```

### Organizing Reports
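These greps compose into longer pipelines. For instance, this sketch pulls every score out of all reports and lists the top values; it assumes the `**Score**: X.Y / 10` lines shown in the report structure above and stays silent when no reports exist:

```shell
# Rank paper scores across all reports, highest first.
grep -h '^\*\*Score\*\*' reports/*.md 2>/dev/null \
  | sed 's/[^0-9.]*\([0-9.]*\).*/\1/' \
  | sort -rn \
  | head -5
```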
By date:

```
reports/
  2025-10-14_arxiv_analysis_45min.md
  2025-10-15_arxiv_analysis_52min.md
  2025-10-16_arxiv_analysis_38min.md
```

By topic (manual):

```
reports/
  machine-learning/
    2025-10-14_arxiv_analysis_45min.md
  astrophysics/
    2025-10-15_arxiv_analysis_52min.md
```

Archive old reports:
```shell
mkdir reports/archive
mv reports/2025-09-* reports/archive/
```
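The manual `mv` generalizes to a tiny helper. This is a sketch relying only on the date-prefixed filenames documented above; `archive_month` is a hypothetical name, not an Aparture command:

```shell
# Sweep all reports from one month (e.g. 2025-09) into reports/archive/.
archive_month() {
  mkdir -p reports/archive
  for f in reports/"$1"-*; do
    [ -e "$f" ] || continue   # skip the literal glob when nothing matches
    mv "$f" reports/archive/
  done
}

archive_month "2025-09"
```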
## Report Quality

### What to Expect
Good Reports:
- ✅ Consistent scoring across similar papers
- ✅ Detailed justifications for scores
- ✅ Comprehensive PDF analyses
- ✅ Clear executive summary
- ✅ Actionable recommendations
Common Issues:
- ⚠️ Score inflation (all papers 7-9)
- ⚠️ Generic justifications
- ⚠️ Missing PDF analyses (download failures)
- ⚠️ Inconsistent formatting
### Improving Quality
Better research criteria:
- Be specific about interests
- Mention concrete techniques
- Provide example topics
- Update regularly based on results
Better model selection:
- Use Opus 4.1/GPT-5 for scoring (higher quality)
- Enable post-processing for consistency
- Use Haiku/Nano only for quick filter
Better configuration:
- Select focused categories
- Adjust score thresholds
- Limit PDF analysis to top papers
- Review and iterate
See Multi-Stage Analysis for optimization tips.
## Sharing Reports
### Within Teams
Markdown format advantages:
- Version control friendly (Git)
- Easy to diff and merge
- Readable in any text editor
- GitHub renders nicely
Collaborative workflows:
- Commit to shared repository
- Use pull requests for review
- Track changes over time
- Search across all reports
### Public Sharing
Before sharing publicly:
- ⚠️ Remove any internal notes
- ⚠️ Check for sensitive information
- ⚠️ Verify arXiv links work
- ✓ Add context for external readers
Publishing options:
- GitHub gists
- Personal blog/website
- Research group webpage
- arXiv "daily picks" lists
> **Research log**
>
> Commit reports to Git and track your research interests over time. Great for identifying trends and documenting your learning journey.