Rodeo Documentation
Rodeo is a graph database ETL pipeline tool that simplifies the process of loading data from CSV and JSON files into Neo4j.
Overview
Rodeo provides a declarative approach to data transformation and loading. Rather than writing a custom script for each data import, you define mapping files that describe how your source data should be transformed into graph nodes and relationships. Rodeo handles the rest, including type conversion, conditional logic, merge operations, and index management.
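To make the declarative idea concrete, here is a minimal Python sketch of a mapping applied to CSV rows. It is not Rodeo's mapping format or implementation: the `MAPPING` structure, the `Person` label, the column names, and the `apply_mapping` helper are all hypothetical, chosen only to show how a description of the data can stand in for a per-import script.

```python
import csv
import io

# Hypothetical mapping structure: NOT Rodeo's actual format, illustration only.
MAPPING = {
    "label": "Person",
    "fields": {
        # target attribute: (source column, type converter)
        "name": ("full_name", str),
        "age": ("age", int),
    },
}


def apply_mapping(row, mapping):
    """Turn one source row into a node description, driven by the mapping."""
    props = {
        target: convert(row[source])
        for target, (source, convert) in mapping["fields"].items()
    }
    return {"label": mapping["label"], "properties": props}


# Assumed sample data; a real import would read from a file instead.
SAMPLE_CSV = "full_name,age\nAda,36\nGrace,45\n"
nodes = [apply_mapping(row, MAPPING) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

In this sketch, changing what gets loaded means editing `MAPPING` rather than the loop, which is the property a declarative approach is after.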
Key Features
- Declarative mappings: Define transformations in YAML without writing code
- Flexible field handling: Map source fields to node and relationship attributes with optional type conversion
- Conditional node creation: Create nodes only when specific conditions are met
- Merge support: Update existing nodes and relationships instead of creating duplicates
- Iterator support: Expand list fields into multiple nodes automatically
- Background processing: Run large imports in the background and monitor progress
- Index management: Define indexes in your mapping files for optimal query performance
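As a rough illustration of three of the features above (conditional creation, iterator expansion, and merging), the following Python sketch models each as a small function over plain dictionaries. None of this reflects how Rodeo implements them; the function names, the `(label, key)` identity used for merging, and the sample rows are assumptions made for the example.

```python
def expand_tags(row):
    """Iterator-style expansion: one node per entry in a list field."""
    return [{"label": "Tag", "key": tag} for tag in row.get("tags", [])]


def node_if(row, condition, build):
    """Conditional creation: build a node only when the condition holds."""
    return [build(row)] if condition(row) else []


def merge_nodes(store, nodes):
    """Merge: update the node with the same (label, key) instead of duplicating it."""
    for node in nodes:
        store.setdefault((node["label"], node["key"]), {}).update(node)
    return store


# Assumed sample rows; the second has no email and repeats the tag "y".
rows = [
    {"id": "1", "email": "a@example.com", "tags": ["x", "y"]},
    {"id": "2", "email": "", "tags": ["y"]},
]

store = {}
for row in rows:
    merge_nodes(store, expand_tags(row))
    merge_nodes(store, node_if(
        row,
        condition=lambda r: bool(r["email"]),
        build=lambda r: {"label": "Email", "key": r["email"]},
    ))
```

Here the repeated tag ends up as a single entry in `store`, and no `Email` node is created for the row with an empty email field.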
Quick Start
- Install Rodeo using the appropriate package for your system (RPM or DEB)
- Run the setup script located at /opt/rodeo/setup_user_service.sh
- Create a configuration file with your Neo4j connection details
- Create a mapping file that defines your data transformations
- Execute your pipeline:
Getting Help
- Run rodeo --help for general usage information
- Run rodeo --version for CLI version information
- Run rodeo version for service version information
License
Rodeo is proprietary software. Refer to the End User License Agreement (EULA), which you must confirm on first run and again whenever the EULA is updated.