Commit f9a60d3

fix: refine the prompt (microsoft#286)
1 parent 954270a commit f9a60d3

1 file changed

Lines changed: 2 additions & 1 deletion

rdagent/scenarios/kaggle/experiment/prompts.yaml

@@ -114,6 +114,7 @@ kg_feature_interface: |-
 3. Ensure consistency in column count across train, validation, and test sets post-feature engineering. For example, fit PCA on the training set and apply the same transformation to validation and test sets to keep the number of columns aligned, and use OneHotEncoder may also cause different number of columns.
 4. Ensure that the generation of new features does not drastically increase the number of columns, which can slow down data processing. For example, avoid creating pairwise interactions for all features, as this would lead to a quadratic increase in the number of columns.
 5. Avoids raising a `ValueError` or any other exceptions that could interrupt the main program's flow. The code should not include checks that could potentially lead to a `ValueError`. Instead, focus on writing robust and fault-tolerant feature engineering functions that handle edge cases and missing data gracefully, without stopping the program.
+6. Specific categories of features can be filtered, and processing can be applied to those categories. For example, normalization can be applied to float-type features, but such processing should not be done on one-hot encoded features.
 
 kg_model_interface: |-
 Your code should contain several parts:
@@ -312,4 +313,4 @@ kg_model_output_format: |-
 
 kg_model_simulator: |-
 The models will be trained on the competition dataset and evaluated on their ability to predict the target. Metrics like accuracy and AUC-ROC is used to evaluate the model performance.
-Model performance will be iteratively improved based on feedback from evaluation results.
+Model performance will be iteratively improved based on feedback from evaluation results.
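
For illustration only, the new guideline 6 in `kg_feature_interface` (normalize float-type features, leave one-hot encoded features alone) could be followed by generated feature code along these lines. This is a minimal sketch, not code from the repository: the function name `scale_float_columns` and the sample column names are hypothetical, and it assumes the one-hot columns are stored as integer indicators. If an encoder emits them as floats, they would have to be excluded by name instead.

```python
import pandas as pd


def scale_float_columns(train: pd.DataFrame, test: pd.DataFrame):
    """Standardize float columns only, using statistics fitted on the training set."""
    train, test = train.copy(), test.copy()
    # Select float-typed columns; integer 0/1 indicator columns are skipped.
    float_cols = train.select_dtypes(include="float").columns
    mean = train[float_cols].mean()
    # Replace a zero standard deviation with 1 so constant columns do not divide by zero.
    std = train[float_cols].std(ddof=0).replace(0, 1.0)
    train[float_cols] = (train[float_cols] - mean) / std
    # Reuse the training statistics so both sets are transformed identically.
    test[float_cols] = (test[float_cols] - mean) / std
    return train, test


# Hypothetical data: one float feature and two one-hot indicator columns.
train = pd.DataFrame({"age": [20.0, 30.0, 40.0], "color_red": [1, 0, 0], "color_blue": [0, 1, 1]})
test = pd.DataFrame({"age": [30.0, 50.0], "color_red": [0, 1], "color_blue": [1, 0]})
train_scaled, test_scaled = scale_float_columns(train, test)
```

Fitting the mean and standard deviation on the training set and reusing them for the test set also keeps the column layout identical across splits, which is what guideline 3 asks for.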
