Looking for a working example Colab/Notebook showing training or fine-tuning of a text generation model capable of converting "short text" -> "programming code text".
I'm learning the topic and would like to fine-tune it with a custom metric on some public GitHub repos.
All I found so far are models that "continue a sentence" or simply generate the text out of the blue. Many thanks!
CodePudding user response:
First, take a look at CodeXGLUE and its repository. Its tasks fall into four categories:
- code-code (clone detection, defect detection, cloze test, code completion, code repair, and code-to-code translation)
- text-code (natural language code search, text-to-code generation)
- code-text (code summarization)
- text-text (documentation translation)
You want the text-to-code generation task. Based on the CodeXGLUE benchmark, one of the best models for this task is CoTexT. CoTexT supports these programming languages: "go", "java", "javascript", "php", "python", "ruby". You can find the pre-trained model on Hugging Face here, and an explanation of how to fine-tune it here.
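To make the workflow a bit more concrete, here is a minimal sketch of the pieces around fine-tuning: preparing (short text, code) pairs, a simple custom metric, and generation. It is not taken from the CoTexT authors. The helper names (`to_seq2seq_example`, `exact_match`, `generate_code`) are made up for illustration, and `generate_code` assumes the checkpoint you pass in loads as a seq2seq model through the Hugging Face `transformers` Auto classes.

```python
from typing import Dict, List


def to_seq2seq_example(description: str, code: str) -> Dict[str, str]:
    """Turn one (short text, code) pair into a source/target record."""
    # Collapse whitespace in the description only; the code keeps its
    # internal layout because indentation can be significant (e.g. Python).
    return {"source": " ".join(description.split()), "target": code.strip()}


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions equal to their reference.

    Whitespace is normalized before comparing, which is an assumption made
    here for simplicity; swap in your own comparison for a custom metric.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        " ".join(p.split()) == " ".join(r.split())
        for p, r in zip(predictions, references)
    )
    return hits / len(predictions)


def generate_code(description: str, model_name: str, max_length: int = 128) -> str:
    """Generate code from a short description with a seq2seq checkpoint.

    model_name is supplied by the caller (hypothetical here); it is assumed
    to be loadable with AutoModelForSeq2SeqLM.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(description, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For fine-tuning, you would map `to_seq2seq_example` over pairs mined from your GitHub repos (for example docstring and function body), train on `source` -> `target`, and score held-out predictions with `exact_match` or whatever metric you care about.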