.. _testing: ================================================================================ Testing Guide ================================================================================ How to test poisson-topicmodels code and write tests for your own code. Running Tests ============= Run all tests: .. code-block:: bash pytest tests/ Run specific test file: .. code-block:: bash pytest tests/test_pf.py Run with coverage: .. code-block:: bash pytest tests/ --cov=poisson_topicmodels Test Categories =============== **Unit Tests**: Function/class behavior - Located in: `tests/test_*.py` - Fast, focused - Test individual components **Integration Tests**: Multiple components together - Located in: `tests/test_integration.py` - Slower, broader scope - Test workflows **Comprehensive Tests**: All models and features - Located in: `tests/test_models_comprehensive.py` - Slowest - Test all variations Test Coverage ============= Current coverage: >70% View coverage report: .. code-block:: bash pytest tests/ --cov=poisson_topicmodels --cov-report=html open htmlcov/index.html # View in browser Writing Tests ============= Test structure: .. code-block:: python import pytest import numpy as np from scipy.sparse import csr_matrix from poisson_topicmodels import PF class TestPF: """Tests for Poisson Factorization model.""" @pytest.fixture def sample_data(self): """Create sample data for testing.""" counts = csr_matrix( np.random.poisson(2, (50, 100)).astype(np.float32) ) vocab = np.array([f'word_{i}' for i in range(100)]) return counts, vocab def test_initialization(self, sample_data): """Test model initializes correctly.""" counts, vocab = sample_data model = PF(counts, vocab, num_topics=10) assert model.num_topics == 10 assert model.vocab.shape[0] == 100 def test_training(self, sample_data): """Test model trains successfully.""" counts, vocab = sample_data model = PF(counts, vocab, num_topics=5, batch_size=16) params = model.train_step(num_steps=10, lr=0.01) assert params is not None def test_return_topics(self, sample_data): """Test topic extraction.""" counts, vocab = sample_data model = PF(counts, vocab, num_topics=5, batch_size=16) model.train_step(num_steps=10, lr=0.01) categories, e_theta = model.return_topics() assert e_theta.shape == (50, 5) Continuous Integration ====================== Tests run automatically on: - Every push to main/develop - Every pull request - Scheduled nightly runs Via GitHub Actions: - Python 3.11, 3.12, 3.13 - CPU and GPU (if available) - Multiple OS: Linux, macOS, Windows View results on GitHub Actions tab. Testing Best Practices ====================== **For users**: - Run tests after installation: ``pytest tests/test_imports.py`` - Before reporting bugs: run full test suite - Document any test failures **For developers**: - Write tests for new features - Aim for >80% code coverage - Test edge cases and error conditions - Document test purpose Test Organization ================= .. list-table:: :widths: 35 65 :header-rows: 1 * - File - Purpose * - test_imports.py - Check packages import correctly * - test_input_validation.py - Validate input handling * - test_pf.py - Poisson Factorization tests * - test_spf.py - Seeded models tests * - test_integration.py - End-to-end workflows * - test_models_comprehensive.py - All models, all variants * - test_training_and_outputs.py - Training process & outputs Common Test Patterns ==================== **Checking shape**: .. code-block:: python categories, e_theta = model.return_topics() assert e_theta.shape == (num_docs, num_topics) **Checking values**: .. code-block:: python beta = model.return_beta() assert (beta.values >= 0).all() # Non-negative **Checking errors**: .. code-block:: python with pytest.raises(ValueError): model = PF(invalid_counts, vocab, num_topics=5) **Parametrized tests**: .. code-block:: python @pytest.mark.parametrize("num_topics", [5, 10, 20]) def test_different_topics(self, sample_data, num_topics): """Test with different numbers of topics.""" counts, vocab = sample_data model = PF(counts, vocab, num_topics=num_topics, batch_size=16) model.train_step(num_steps=5, lr=0.01) _, e_theta = model.return_topics() assert e_theta.shape[1] == num_topics Debugging Tests =============== Run with verbose output: .. code-block:: bash pytest tests/ -v # Verbose pytest tests/ -vv # Very verbose pytest tests/ -s # Show print statements Stop at first failure: .. code-block:: bash pytest tests/ -x Debug with pdb: .. code-block:: bash pytest tests/ --pdb # Drop to debugger on failure Performance Testing =================== Time specific tests: .. code-block:: bash pytest tests/ --durations=10 # 10 slowest tests Mark slow tests: .. code-block:: python @pytest.mark.slow def test_large_model(self): """This test is slow.""" ... Run only fast tests: .. code-block:: bash pytest tests/ -m "not slow" Testing GPU Code ================ Tests detect GPU automatically: .. code-block:: python import jax def test_with_gpu_if_available(): if not jax.devices(): pytest.skip("GPU not available") # Test GPU-specific code ... Or mark GPU tests: .. code-block:: python @pytest.mark.gpu def test_gpu_training(self): """Only run on GPU.""" ... Run GPU tests: .. code-block:: bash pytest tests/ -m gpu Troubleshooting Tests ===================== **Tests fail on import**: Install development dependencies: .. code-block:: bash pip install -e ".[dev]" **Random test failures**: Set seed for reproducibility: .. code-block:: python np.random.seed(42) **Memory issues during testing**: Run with smaller datasets or skip large tests: .. code-block:: bash pytest tests/ -m "not large" **Tests hang**: Add timeout: .. code-block:: bash pytest tests/ --timeout=60 # 60 second timeout per test Contributing Tests ================== When submitting code: 1. Write tests for new functionality 2. Ensure all tests pass 3. Maintain or increase coverage (aim for >70%) 4. Document test purpose See :doc:`../contributing_guide/index` for full guidelines. Test Configuration ================== Tests configured in: - ``pytest.ini`` – Pytest settings - ``conftest.py`` – Shared fixtures - ``.github/workflows/`` – CI configuration Examples from conftest: .. code-block:: python @pytest.fixture def random_seed(): """Ensure reproducibility.""" np.random.seed(42) return 42 @pytest.fixture def sample_dtm(): """Reusable sample data.""" counts = csr_matrix( np.random.poisson(2, (100, 500)).astype(np.float32) ) vocab = np.array([f'word_{i}' for i in range(500)]) return counts, vocab Testing Progress ================ Latest test results: - Total tests: 150+ - Coverage: >70% - CI status: Passing on main Contribution Checklist ====================== When adding code: ✓ Write unit tests ✓ Write integration tests ✓ All tests pass locally ✓ Coverage doesn't decrease ✓ Document tested behavior ✓ Test on Python 3.11, 3.12, 3.13 Next Steps ========== - **Run tests**: ``pytest tests/`` - **Write tests**: Follow patterns above - **CI/CD**: Automated via GitHub Actions - **Contribute tests**: See :doc:`../contributing_guide/index`