Use Case: Dataset Management
Organize datasets by date for easier management and analysis.
Overview
Datasets often accumulate over time without proper organization. This use case demonstrates how to use fx organize to structure datasets by date, making them easier to find, analyze, and manage.
Dataset Organization Workflow
1. Preview Organization Plan
Always preview before organizing:
# Preview organization with dry-run
fx organize ~/Data --dry-run
# Output shows plan without moving files
# Review to ensure correct behavior
2. Organize by Creation Date
Organize datasets by creation time:
# Basic organization
fx organize ~/Data
# Output directory: ~/Data/organized/
# Structure: organized/2026/202601/20260110/
3. Filter by File Type
Organize only specific dataset types:
# Organize only CSV files
fx organize ~/Data -i "*.csv" --recursive
# Organize only JSON datasets
fx organize ~/Data -i "*.json" --recursive
# Organize multiple types
fx organize ~/Data -i "*.csv" -i "*.json" -i "*.parquet" --recursive
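Before running a filtered organize, it can help to see which files a set of patterns would select. The following is a minimal sketch using plain find, not fx itself; list_matching is a hypothetical helper, and it assumes the -i patterns are matched against file names the way find -name matches them, which the fx documentation should confirm.

```bash
# list_matching DIR [PATTERN...]
# Hypothetical helper: lists regular files under DIR whose names match
# any of the given glob patterns (all files when no pattern is given).
list_matching() {
    local dir pattern first
    dir=$1
    shift
    if [ "$#" -eq 0 ]; then
        find "$dir" -type f | sort
        return
    fi
    # Rebuild the positional parameters as: -name P1 -o -name P2 ...
    # The loop list is expanded once, so resetting "$@" inside is safe.
    first=1
    for pattern in "$@"; do
        if [ "$first" -eq 1 ]; then
            set -- -name "$pattern"
            first=0
        else
            set -- "$@" -o -name "$pattern"
        fi
    done
    find "$dir" -type f \( "$@" \) | sort
}
```

For example, list_matching ~/Data "*.csv" "*.json" "*.parquet" prints the candidate files for the multi-type command above without touching them.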
4. Custom Depth and Output
Configure organization structure:
# Use 2-level depth (year/day)
fx organize ~/Data --depth 2 -o ~/Data/Organized
# Use 1-level depth (day only)
fx organize ~/Data --depth 1 -o ~/Data/Simple
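To make the depth settings concrete, here is a small sketch that maps a date to the directory path each depth would produce. It is an illustration only: date_path is a hypothetical helper, and the layouts are inferred from the examples in this guide (three levels as in organized/2026/202601/20260110/, two levels as year/day, one level as day only), with the three-level layout assumed to be the default.

```bash
# date_path DATE [DEPTH]
# Hypothetical helper: DATE is YYYY-MM-DD, DEPTH is 1, 2, or 3
# (3 is assumed here to be the default layout).
date_path() {
    local date_str depth y rest m d
    date_str=$1
    depth=${2:-3}
    # Split YYYY-MM-DD with parameter expansion
    y=${date_str%%-*}
    rest=${date_str#*-}
    m=${rest%%-*}
    d=${rest#*-}
    case "$depth" in
        3) printf '%s/%s%s/%s%s%s\n' "$y" "$y" "$m" "$y" "$m" "$d" ;;  # year/month/day
        2) printf '%s/%s%s%s\n' "$y" "$y" "$m" "$d" ;;                 # year/day
        1) printf '%s%s%s\n' "$y" "$m" "$d" ;;                         # day only
        *) echo "depth must be 1, 2, or 3" >&2; return 1 ;;
    esac
}
```

For example, date_path 2026-01-10 2 prints 2026/20260110, which is where a file dated 10 January 2026 would be expected to land under the output directory with --depth 2.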
5. Clean Up Empty Directories
Remove empty source directories after organization:
# Organize and clean empty directories
fx organize ~/Data --recursive --clean-empty
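It can be useful to check which directories are already empty before relying on cleanup. The sketch below uses find directly and does not reproduce how fx decides what to remove; list_empty_dirs is a hypothetical helper, and -empty is an extension available in GNU and BSD find rather than a POSIX option.

```bash
# list_empty_dirs DIR
# Hypothetical helper: prints every empty directory under DIR.
# Relies on the -empty test supported by GNU and BSD find.
list_empty_dirs() {
    find "$1" -type d -empty | sort
}
```

Running list_empty_dirs ~/Data after organizing without --clean-empty shows the directories that were left behind.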
Real-World Scenarios
Scenario 1: Research Data Organization
Organize research datasets by date:
# Preview organization
fx organize ~/Research/Data --dry-run --recursive -o ~/Research/Organized
# Execute organization
fx organize ~/Research/Data --recursive -o ~/Research/Organized
Scenario 2: ML Training Data
Organize machine learning training data:
# Organize by file type
fx organize ~/ML/Data -i "*.csv" -i "*.parquet" -i "*.json" --recursive
# Use creation time
fx organize ~/ML/Data --date-source created --recursive
Scenario 3: Sensor Data Management
Organize time-series sensor data:
# Organize sensor data by date
fx organize ~/Sensors --recursive --date-source modified
# Clean up empty directories
fx organize ~/Sensors --recursive --clean-empty
Scenario 4: Archive Management
Organize archived datasets:
# Organize archives with custom depth
fx organize ~/Archives --depth 2 -o ~/Archives/Sorted
# Use specific patterns
fx organize ~/Archives -i "*.tar" -i "*.zip" -i "*.gz" --recursive
Dataset Organization Script
#!/bin/bash
# organize_datasets.sh
set -e
DATA_DIR="${1:-}"
if [ -z "$DATA_DIR" ]; then
    echo "Usage: organize_datasets.sh DATA_DIR" >&2
    echo "Example: organize_datasets.sh ~/Data" >&2
    exit 1
fi
echo "=== Dataset Organization ==="
echo "Source: $DATA_DIR"
echo ""
# 1. Preview organization
echo "1. Previewing organization plan..."
fx organize "$DATA_DIR" --dry-run --recursive
echo ""
# 2. Confirm organization
read -p "Proceed with organization? (y/n) " -n 1 -r
echo ""
if [[ $REPLY =~ ^[Yy]$ ]]; then
    echo "2. Organizing datasets..."
    # Execute organization
    fx organize "$DATA_DIR" --recursive --clean-empty --yes --quiet
    echo " Organization complete"
    echo ""
else
    echo "2. Organization cancelled"
    echo ""
    exit 0
fi
# 3. Summary
echo "3. Summary..."
OUTPUT_DIR="$DATA_DIR/organized"
echo " Output directory: $OUTPUT_DIR"
if [ -d "$OUTPUT_DIR" ]; then
    # tr strips the padding that BSD/macOS wc adds around the count
    FILE_COUNT=$(find "$OUTPUT_DIR" -type f | wc -l | tr -d ' ')
    echo " Files organized: $FILE_COUNT"
fi
echo "=== Organization Complete ==="
Advanced Dataset Management
Organize by Data Type
# Organize CSVs
fx organize ~/Data -i "*.csv" -o ~/Data/CSV
# Organize JSON
fx organize ~/Data -i "*.json" -o ~/Data/JSON
# Organize Parquet
fx organize ~/Data -i "*.parquet" -o ~/Data/Parquet
Conflict Handling
# Skip conflicting datasets
fx organize ~/Data --on-conflict skip --recursive
# Rename conflicting datasets
fx organize ~/Data --on-conflict rename --recursive
# Overwrite old datasets (use carefully)
fx organize ~/Data --on-conflict overwrite --recursive
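Conflicts can only arise when two files with the same name end up in the same destination directory. As a rough pre-check, the sketch below lists file names that occur more than once anywhere under a source tree. duplicate_names is a hypothetical helper, and its output is a superset of real conflicts, since same-named files with different dates would land in different directories.

```bash
# duplicate_names DIR
# Hypothetical helper: prints base names that appear more than once under DIR.
# Assumes file names do not contain newlines.
duplicate_names() {
    find "$1" -type f | sed 's|.*/||' | sort | uniq -d
}
```

If duplicate_names ~/Data prints nothing, the choice of --on-conflict mode should not matter for that run, provided the output directory holds no same-named files yet; otherwise the listed names are the ones worth reviewing before choosing overwrite.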
Verification
# Verify organization
echo "Organized datasets:"
find ~/Data/organized -type f -name "*.csv" | head -20
# Check for missing datasets
echo "Unorganized datasets:"
find ~/Data -maxdepth 1 -type f -name "*.csv" 2>/dev/null
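A count comparison gives a quick sanity check on top of the listings above. This is a minimal sketch, not part of fx: count_files is a hypothetical helper, and the comparison assumes the output directory sits inside the source tree (as with ~/Data/organized), so files that are moved or skipped still count toward the same total.

```bash
# count_files DIR [PATTERN]
# Hypothetical helper: counts regular files under DIR, optionally limited
# to names matching PATTERN (a find -name glob).
count_files() {
    find "$1" -type f -name "${2:-*}" | wc -l | tr -d ' '
}
```

Recording before=$(count_files ~/Data) ahead of the run and comparing it with count_files ~/Data afterward should give the same number; a lower count after a run with --on-conflict overwrite indicates files were replaced.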
Related Commands
fx organize - Organize files by date
fx filter - Filter files by extension
fx size - Analyze file sizes