Testing¶
The Data Packet includes a comprehensive testing suite to ensure reliability and maintainability.
Test Structure¶
The test suite is organized into several modules, each focusing on a specific component:
Core Tests:
- test_core/test_config.py: Configuration management tests
- test_core/test_exceptions.py: Exception handling tests
- test_core/test_logging.py: Logging system tests
Generation Tests:
- test_generation/test_script.py: AI script generation tests
- test_generation/test_audio.py: TTS audio generation tests
- test_generation/test_rss.py: RSS feed generation tests
Sources Tests:
- test_sources/test_base.py: Base article source tests
- test_sources/test_wired.py: Wired.com source tests
- test_sources/test_techcrunch.py: TechCrunch source tests
Workflow Tests:
- test_workflows/test_podcast.py: End-to-end pipeline tests
Utilities Tests:
- test_utils/test_s3.py: AWS S3 storage tests
- test_cli.py: Command-line interface tests
Test Framework¶
The package uses Python’s built-in unittest framework with the following patterns:
- Test Classes: Each component has a dedicated test class inheriting from unittest.TestCase
- Setup Methods: setUp() methods create mock dependencies and test data
- Mock Objects: Extensive use of unittest.mock for isolating components
- Assertions: Standard unittest assertions (assertEqual, assertIn, etc.)
Running Tests¶
Using Hatch (Recommended)¶
# Run all tests (assumes an active Hatch environment, e.g. after running `hatch shell`)
pytest tests/ -v
# Run a specific test package
pytest tests/test_core/ -v
# Run with coverage
pytest --cov=the_data_packet tests/
Using Python Directly¶
# Run all tests
python -m unittest discover tests -v
# Run specific test
python -m unittest tests.test_core.test_config.TestConfig.test_initialization -v
Test Coverage¶
The test suite breaks down by area as follows:
231 Total Tests: Covering all major functionality with 100% pass rate
Core: 45+ tests covering configuration, exceptions, and logging
Generation: 60+ tests covering AI script generation, TTS audio, and RSS feeds
Sources: 50+ tests covering article collection from multiple news sources
Workflows: 35+ tests covering end-to-end pipeline orchestration
Utilities: 25+ tests covering S3 storage and HTTP handling
CLI: 15+ tests covering command-line interface and argument parsing
Key Testing Patterns¶
Mock Usage¶
Tests extensively use mocking to isolate components:
def setUp(self):
    self.scraper = WiredArticleScraper()
    self.scraper.rss_client = Mock()
    self.scraper.http_client = Mock()
    self.scraper.extractor = Mock()
    # Configure mock returns
    self.scraper.rss_client.get_latest_article_url.return_value = "https://example.com"
    self.scraper.extractor.extract.return_value = self.sample_article_data
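To see the mocking pattern run end to end, here is a self-contained sketch. `ArticleScraper` is a hypothetical stand-in for the real scraper: its constructor, its `get_latest_article` method, and the `extract(html, url)` signature are assumptions for illustration only. Beyond configuring return values, it also shows how a mock can verify which calls the code under test made.

```python
import unittest
from unittest.mock import Mock


class ArticleScraper:
    """Hypothetical stand-in for the real scraper; its internals are assumed."""

    def __init__(self, rss_client, http_client, extractor):
        self.rss_client = rss_client
        self.http_client = http_client
        self.extractor = extractor

    def get_latest_article(self):
        url = self.rss_client.get_latest_article_url()
        html = self.http_client.get_page(url)
        return self.extractor.extract(html, url)


class TestArticleScraper(unittest.TestCase):
    def setUp(self):
        self.sample_article_data = {"title": "Test Article", "content": "Test content"}
        # Every collaborator is a Mock, so no network access happens
        self.scraper = ArticleScraper(Mock(), Mock(), Mock())
        self.scraper.rss_client.get_latest_article_url.return_value = "https://example.com"
        self.scraper.http_client.get_page.return_value = "<html></html>"
        self.scraper.extractor.extract.return_value = self.sample_article_data

    def test_returns_extracted_article(self):
        article = self.scraper.get_latest_article()
        self.assertEqual(article, self.sample_article_data)
        # Mocks record their calls, so the wiring between components can be checked
        self.scraper.http_client.get_page.assert_called_once_with("https://example.com")
        self.scraper.extractor.extract.assert_called_once_with(
            "<html></html>", "https://example.com"
        )


# Run the case directly; `python -m unittest` would also pick it up
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestArticleScraper)
run_result = unittest.TextTestRunner(verbosity=2).run(suite)
```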
Error Testing¶
Error conditions are thoroughly tested:
def test_http_error_handling(self):
    self.client.session.get.side_effect = requests.exceptions.Timeout()
    with self.assertRaises(RuntimeError) as cm:
        self.client.get_page("https://example.com")
    self.assertIn("timeout", str(cm.exception).lower())
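The error test above relies on the client converting a low-level timeout into a RuntimeError. A runnable sketch of that idea follows. `PageClient` is hypothetical, and Python's built-in `TimeoutError` is swapped in for `requests.exceptions.Timeout` so the example has no third-party dependency; the real client's wrapping logic may differ.

```python
import unittest
from unittest.mock import Mock


class PageClient:
    """Hypothetical client that converts timeouts into RuntimeError."""

    def __init__(self, session):
        self.session = session

    def get_page(self, url):
        try:
            response = self.session.get(url, timeout=10)
        except TimeoutError as exc:
            # Chain the original error so it stays visible in tracebacks
            raise RuntimeError(f"Timeout while fetching {url}") from exc
        return response.text


class TestPageClientErrors(unittest.TestCase):
    def setUp(self):
        self.client = PageClient(session=Mock())

    def test_timeout_is_wrapped(self):
        # side_effect makes the mocked call raise instead of returning a value
        self.client.session.get.side_effect = TimeoutError("timed out")
        with self.assertRaises(RuntimeError) as cm:
            self.client.get_page("https://example.com")
        self.assertIn("timeout", str(cm.exception).lower())
        self.assertIsInstance(cm.exception.__cause__, TimeoutError)

    def test_success_returns_body(self):
        self.client.session.get.return_value.text = "<html></html>"
        self.assertEqual(self.client.get_page("https://example.com"), "<html></html>")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPageClientErrors)
run_result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Pairing the failure case with a success case on the same mock guards against a test that passes only because the client always raises.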
Sample Data¶
Shared test data is defined in conftest.py:
SAMPLE_ARTICLE_HTML = """
<!DOCTYPE html>
<html>
<head><title>Test Article | WIRED</title></head>
<body>
<article><p>Test content</p></article>
</body>
</html>
"""
Test Categories¶
Unit Tests¶
Models: Data validation, serialization, edge cases
Clients: HTTP requests, RSS parsing, timeout handling
Extractors: HTML parsing, fallback methods, malformed content
Individual Methods: Each public method has dedicated tests
Integration Tests¶
End-to-End Flows: Complete scraping workflows
Component Integration: How modules work together
Real-world Scenarios: Handling actual webpage structures
CLI Tests¶
Argument Parsing: Valid and invalid command combinations
Output Formatting: JSON and text output verification
Error Handling: Network errors, invalid URLs
Process Management: Signal handling, exit codes
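Argument-parsing tests of the kind listed above might look like the following sketch. The parser built by `build_parser`, including its `--source` and `--format` options, is hypothetical; the real CLI's options may differ. The one behavior it relies on is standard argparse: invalid arguments raise SystemExit with exit code 2.

```python
import argparse
import contextlib
import io
import unittest


def build_parser():
    """Hypothetical parser; the real CLI's options may differ."""
    parser = argparse.ArgumentParser(prog="example-cli")
    parser.add_argument("--source", choices=["wired", "techcrunch"], required=True)
    parser.add_argument("--format", choices=["json", "text"], default="text")
    return parser


class TestCliArguments(unittest.TestCase):
    def setUp(self):
        self.parser = build_parser()

    def test_valid_arguments(self):
        args = self.parser.parse_args(["--source", "wired", "--format", "json"])
        self.assertEqual(args.source, "wired")
        self.assertEqual(args.format, "json")

    def test_invalid_choice_exits_with_code_2(self):
        # argparse prints usage to stderr and exits; silence the output here
        with contextlib.redirect_stderr(io.StringIO()):
            with self.assertRaises(SystemExit) as cm:
                self.parser.parse_args(["--source", "unknown"])
        self.assertEqual(cm.exception.code, 2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCliArguments)
run_result = unittest.TextTestRunner(verbosity=2).run(suite)
```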
Best Practices¶
Writing New Tests¶
When adding new functionality:
Create Test First: Write tests before implementation (TDD)
Use Descriptive Names: Test method names should describe the scenario
Test Edge Cases: Include error conditions and boundary values
Mock External Dependencies: Isolate the code under test
Verify All Paths: Test both success and failure scenarios
Example Test Structure:
class TestNewFeature(unittest.TestCase):
    def setUp(self):
        """Set up test fixtures."""
        # Create mocks and test data

    def test_successful_operation(self):
        """Test normal operation with valid input."""
        # Arrange, Act, Assert

    def test_error_handling(self):
        """Test error handling with invalid input."""
        # Test exception scenarios

    def test_edge_cases(self):
        """Test boundary conditions."""
        # Test limits and edge cases
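A filled-in version of this skeleton could look like the sketch below. The function under test, `clamp_episode_count`, is invented purely for illustration and does not exist in the package. It shows the Arrange, Act, Assert layout and uses `subTest` so that each boundary value is reported separately.

```python
import unittest


def clamp_episode_count(value, minimum=1, maximum=10):
    """Hypothetical helper: limit a requested episode count to a range."""
    if not isinstance(value, int):
        raise TypeError("value must be an int")
    return max(minimum, min(value, maximum))


class TestClampEpisodeCount(unittest.TestCase):
    def test_successful_operation(self):
        """Test normal operation with valid input."""
        # Arrange
        requested = 5
        # Act
        clamped = clamp_episode_count(requested)
        # Assert
        self.assertEqual(clamped, 5)

    def test_error_handling(self):
        """Test error handling with invalid input."""
        with self.assertRaises(TypeError):
            clamp_episode_count("5")

    def test_edge_cases(self):
        """Test boundary conditions."""
        cases = [(0, 1), (1, 1), (10, 10), (11, 10)]
        for value, expected in cases:
            # subTest keeps going after a failure and names the failing input
            with self.subTest(value=value):
                self.assertEqual(clamp_episode_count(value), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestClampEpisodeCount)
run_result = unittest.TextTestRunner(verbosity=2).run(suite)
```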
Continuous Integration¶
Tests are automatically run on:
Pre-commit: Before code commits
Pull Requests: Before merging changes
Release Builds: Before creating packages
The CI pipeline includes:
Type Checking: mypy validation
Code Formatting: black and isort verification
Test Execution: Full test suite
Coverage Reporting: Minimum coverage thresholds