Self-Instruct

GitHub - yizhongw/self-instruct: Aligning pretrained language models with instruction data generated by themselves.

  • Limited seed set of manually written tasks to guide overall generation
  • First Phase
    • prompt the model to generate instructions for new tasks (see the sketch after this list)
  • Second Phase
    • create input-output instances for those instructions, which are then used to supervise the instruction tuning
  • Final Phase
    • heuristics to automatically filter low-quality or repeated instructions
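
A minimal sketch of the first-phase bootstrap loop, assuming a hypothetical `complete(prompt)` LLM call and a toy seed pool (the actual pipeline starts from 175 manually written seed tasks and mixes seed and model-generated instructions in each prompt):

```python
import random

# Toy stand-in for the seed pool of manually written tasks.
seed_instructions = [
    "Write a short story about a robot learning to paint.",
    "Classify the sentiment of the given movie review as positive or negative.",
    "Convert the following temperature from Celsius to Fahrenheit.",
]

def build_generation_prompt(pool, num_examples=8):
    """Sample in-context examples and ask the model to propose a new task."""
    examples = random.sample(pool, min(num_examples, len(pool)))
    lines = ["Come up with a series of tasks:"]
    lines += [f"Task {i + 1}: {inst}" for i, inst in enumerate(examples)]
    lines.append(f"Task {len(examples) + 1}:")  # the model continues from here
    return "\n".join(lines)

def bootstrap_instructions(pool, complete, rounds=10):
    # `complete` stands in for any LLM completion call (e.g. an API client);
    # it is not part of the self-instruct repo.
    for _ in range(rounds):
        prompt = build_generation_prompt(pool)
        new_instruction = complete(prompt).strip()
        if new_instruction:
            pool.append(new_instruction)  # generated tasks feed back into the pool
    return pool
```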

Want the generated instructions to have a fairly small overlap with the seed tasks, so the pipeline produces genuinely new tasks rather than near-copies.

The instruction-tuned model outperforms the vanilla model by a large margin.

Human evaluation shows broad abilities across diverse tasks.

Post Processing

  • To encourage diversity, a generated instruction is added to the pool only when its ROUGE-L similarity with every existing instruction is less than 0.7.
  • Instructions containing specific keywords are excluded.
  • Invalid generations are excluded based on heuristics (instructions that are too long or too short, repetitions, etc.).
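
A minimal sketch of the ROUGE-L diversity filter, assuming the `rouge-score` package (`pip install rouge-score`) for the similarity computation; the 0.7 threshold comes from the notes above, and the function names are illustrative:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)

def is_novel(candidate: str, pool: list[str], threshold: float = 0.7) -> bool:
    """Keep a candidate only if it is not too similar to anything already in the pool."""
    for existing in pool:
        # ROUGE-L F-measure between an existing instruction and the candidate.
        score = scorer.score(existing, candidate)["rougeL"].fmeasure
        if score >= threshold:
            return False
    return True

# Usage: only append candidates that pass the filter.
pool = ["Translate the given sentence into French."]
candidate = "Summarize the following paragraph in one sentence."
if is_novel(candidate, pool):
    pool.append(candidate)
```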