Raw Data to Clean Dataset

Two-step chain: assess data quality issues, then execute a cleaning and transformation plan.

Category: data
Difficulty: beginner
Platforms: chatgpt, claude
Tags: data-cleaning, data-quality, assessment, preprocessing, chain

Prompt Template

You are a data quality analyst. Assess the quality of this dataset.

Dataset description: {{dataset}}
Columns/fields: {{columns}}
Sample data or issues noticed: {{sample}}
Intended use: {{intended_use}}

## Data Quality Assessment

### Completeness
| Column | Missing Count | Missing % | Impact on Analysis |
| --- | --- | --- | --- |

### Accuracy
- Data type mismatches found:
- Invalid values found:
- Range violations:

### Consistency
- Duplicate rows:
- Conflicting records:
- Format inconsistencies (dates, names, codes):

### Timeliness
- Data freshness:
- Stale records:

## Quality Score
| Dimension | Score (1-10) | Critical Issues |
| --- | --- | --- |
| Completeness | | |
| Accuracy | | |
| Consistency | | |
| Timeliness | | |
| Overall | | |

## Cleaning Priority List
| Priority | Issue | Affected Rows | Recommended Action | Complexity |
| --- | --- | --- | --- | --- |

Tips
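
The {{sample}} placeholder works best when it carries concrete numbers rather than impressions. The sketch below is one possible way to compute the figures the Completeness and Consistency sections ask for (missing count, missing %, exact duplicate rows) before pasting them into the prompt. It uses only the Python standard library. The function names `profile_rows` and `format_sample`, the `MISSING_TOKENS` set, and the toy rows are all illustrative assumptions, not part of the template.

```python
from collections import Counter

# Strings treated as "missing". This set is an assumption; adjust it per dataset.
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def is_missing(value):
    """Return True if a cell should count as missing."""
    if value is None:
        return True
    return str(value).strip().lower() in MISSING_TOKENS


def profile_rows(rows, columns):
    """Compute per-column missing stats and an exact-duplicate row count.

    rows: list of dicts (for example from csv.DictReader).
    columns: list of column names to profile.
    """
    total = len(rows)
    completeness = {}
    for col in columns:
        missing = sum(1 for row in rows if is_missing(row.get(col)))
        pct = round(100.0 * missing / total, 1) if total else 0.0
        completeness[col] = {"missing_count": missing, "missing_pct": pct}

    # Count every repeat of a row whose values match an earlier row in all columns.
    seen = Counter(tuple(row.get(col) for col in columns) for row in rows)
    duplicate_rows = sum(count - 1 for count in seen.values())

    return {
        "total_rows": total,
        "completeness": completeness,
        "duplicate_rows": duplicate_rows,
    }


def format_sample(profile):
    """Render the profile as plain text suitable for the {{sample}} placeholder."""
    lines = [f"Total rows: {profile['total_rows']}"]
    for col, stats in profile["completeness"].items():
        lines.append(
            f"{col}: {stats['missing_count']} missing ({stats['missing_pct']}%)"
        )
    lines.append(f"Exact duplicate rows: {profile['duplicate_rows']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_rows = [
        {"id": "1", "email": "a@example.com", "signup_date": "2024-01-05"},
        {"id": "2", "email": "", "signup_date": "05/01/2024"},
        {"id": "2", "email": "", "signup_date": "05/01/2024"},
        {"id": "3", "email": "N/A", "signup_date": None},
    ]
    print(format_sample(profile_rows(toy_rows, ["id", "email", "signup_date"])))
```

The printed summary can be pasted into {{sample}} as-is. Checks that need judgment, such as inconsistent date formats or conflicting records, are left for the assessment prompt itself.
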