Testing Guide

How to test poisson-topicmodels code and write tests for your own code.

Running Tests

Run all tests:

pytest tests/

Run specific test file:

pytest tests/test_pf.py

Run with coverage:

pytest tests/ --cov=poisson_topicmodels

Test Categories

Unit Tests: Function/class behavior

  • Located in: tests/test_*.py

  • Fast, focused

  • Test individual components

Integration Tests: Multiple components together

  • Located in: tests/test_integration.py

  • Slower, broader scope

  • Test workflows

Comprehensive Tests: All models and features

  • Located in: tests/test_models_comprehensive.py

  • Slowest

  • Test all variations

Test Coverage

Current coverage: >70%

View coverage report:

pytest tests/ --cov=poisson_topicmodels --cov-report=html
open htmlcov/index.html  # View in browser

Writing Tests

Test structure:

import pytest
import numpy as np
from scipy.sparse import csr_matrix
from poisson_topicmodels import PF

class TestPF:
    """Tests for Poisson Factorization model."""

    @pytest.fixture
    def sample_data(self):
        """Create sample data for testing."""
        counts = csr_matrix(
            np.random.poisson(2, (50, 100)).astype(np.float32)
        )
        vocab = np.array([f'word_{i}' for i in range(100)])
        return counts, vocab

    def test_initialization(self, sample_data):
        """Test model initializes correctly."""
        counts, vocab = sample_data
        model = PF(counts, vocab, num_topics=10)
        assert model.num_topics == 10
        assert model.vocab.shape[0] == 100

    def test_training(self, sample_data):
        """Test model trains successfully."""
        counts, vocab = sample_data
        model = PF(counts, vocab, num_topics=5, batch_size=16)
        params = model.train_step(num_steps=10, lr=0.01)
        assert params is not None

    def test_return_topics(self, sample_data):
        """Test topic extraction."""
        counts, vocab = sample_data
        model = PF(counts, vocab, num_topics=5, batch_size=16)
        model.train_step(num_steps=10, lr=0.01)
        categories, e_theta = model.return_topics()
        assert e_theta.shape == (50, 5)

Continuous Integration

Tests run automatically on:

  • Every push to main/develop

  • Every pull request

  • Scheduled nightly runs

Via GitHub Actions:

  • Python 3.11, 3.12, 3.13

  • CPU and GPU (if available)

  • Multiple OS: Linux, macOS, Windows

View results on GitHub Actions tab.

Testing Best Practices

For users:

  • Run tests after installation: pytest tests/test_imports.py

  • Before reporting bugs: run full test suite

  • Document any test failures

For developers:

  • Write tests for new features

  • Aim for >80% code coverage

  • Test edge cases and error conditions

  • Document test purpose

Test Organization

File

Purpose

test_imports.py

Check packages import correctly

test_input_validation.py

Validate input handling

test_pf.py

Poisson Factorization tests

test_spf.py

Seeded models tests

test_integration.py

End-to-end workflows

test_models_comprehensive.py

All models, all variants

test_training_and_outputs.py

Training process & outputs

Common Test Patterns

Checking shape:

categories, e_theta = model.return_topics()
assert e_theta.shape == (num_docs, num_topics)

Checking values:

beta = model.return_beta()
assert (beta.values >= 0).all()  # Non-negative

Checking errors:

with pytest.raises(ValueError):
    model = PF(invalid_counts, vocab, num_topics=5)

Parametrized tests:

@pytest.mark.parametrize("num_topics", [5, 10, 20])
def test_different_topics(self, sample_data, num_topics):
    """Test with different numbers of topics."""
    counts, vocab = sample_data
    model = PF(counts, vocab, num_topics=num_topics, batch_size=16)
    model.train_step(num_steps=5, lr=0.01)
    _, e_theta = model.return_topics()
    assert e_theta.shape[1] == num_topics

Debugging Tests

Run with verbose output:

pytest tests/ -v  # Verbose
pytest tests/ -vv # Very verbose
pytest tests/ -s  # Show print statements

Stop at first failure:

pytest tests/ -x

Debug with pdb:

pytest tests/ --pdb  # Drop to debugger on failure

Performance Testing

Time specific tests:

pytest tests/ --durations=10  # 10 slowest tests

Mark slow tests:

@pytest.mark.slow
def test_large_model(self):
    """This test is slow."""
    ...

Run only fast tests:

pytest tests/ -m "not slow"

Testing GPU Code

Tests detect GPU automatically:

import jax

def test_with_gpu_if_available():
    if not jax.devices():
        pytest.skip("GPU not available")

    # Test GPU-specific code
    ...

Or mark GPU tests:

@pytest.mark.gpu
def test_gpu_training(self):
    """Only run on GPU."""
    ...

Run GPU tests:

pytest tests/ -m gpu

Troubleshooting Tests

Tests fail on import:

Install development dependencies:

pip install -e ".[dev]"

Random test failures:

Set seed for reproducibility:

np.random.seed(42)

Memory issues during testing:

Run with smaller datasets or skip large tests:

pytest tests/ -m "not large"

Tests hang:

Add timeout:

pytest tests/ --timeout=60  # 60 second timeout per test

Contributing Tests

When submitting code:

  1. Write tests for new functionality

  2. Ensure all tests pass

  3. Maintain or increase coverage (aim for >70%)

  4. Document test purpose

See Contributing Guide for full guidelines.

Test Configuration

Tests configured in:

  • pytest.ini – Pytest settings

  • conftest.py – Shared fixtures

  • .github/workflows/ – CI configuration

Examples from conftest:

@pytest.fixture
def random_seed():
    """Ensure reproducibility."""
    np.random.seed(42)
    return 42

@pytest.fixture
def sample_dtm():
    """Reusable sample data."""
    counts = csr_matrix(
        np.random.poisson(2, (100, 500)).astype(np.float32)
    )
    vocab = np.array([f'word_{i}' for i in range(500)])
    return counts, vocab

Testing Progress

Latest test results:

  • Total tests: 150+

  • Coverage: >70%

  • CI status: Passing on main

Contribution Checklist

When adding code:

✓ Write unit tests ✓ Write integration tests ✓ All tests pass locally ✓ Coverage doesn’t decrease ✓ Document tested behavior ✓ Test on Python 3.11, 3.12, 3.13

Next Steps

  • Run tests: pytest tests/

  • Write tests: Follow patterns above

  • CI/CD: Automated via GitHub Actions

  • Contribute tests: See Contributing Guide