Gorilla is a retriever-aware finetuned LLaMA-7B model that mitigates the hallucination of API calls that occurs when prompting LLMs directly. It is trained and evaluated on APIBench, a dataset of machine-learning APIs collected from HuggingFace, TorchHub, and TensorHub, with the goal of making LLM outputs more reliable and applicable. To construct APIBench, APIs are scraped from these sources, filtered, and processed into a structured format, which is then expanded using GPT-4 to generate synthetic instruction data for each API.
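The dataset-construction step described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the record fields, the prompt wording, and the helper names are all hypothetical stand-ins, and the GPT-4 call is replaced by placeholder instructions.

```python
import json

# Hypothetical API record in a structured format (illustrative field
# names; the exact APIBench schema may differ).
api_record = {
    "domain": "Image Classification",
    "api_call": "model = torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    "api_arguments": {"repo": "pytorch/vision", "model": "resnet50", "pretrained": True},
    "functionality": "Loads a pretrained ResNet-50 for image classification.",
}

def build_instruction_prompt(record, n=3):
    """Format a prompt asking GPT-4 to synthesize n user instructions
    that this API could satisfy (a sketch; the real prompt differs)."""
    return (
        f"Given the following API:\n{json.dumps(record, indent=2)}\n"
        f"Write {n} distinct, real-world user instructions that this API "
        "could fulfill. Do not mention the API name in the instructions."
    )

def to_training_pairs(record, instructions):
    """Pair each synthetic instruction with the ground-truth API call,
    yielding {instruction, api_call} examples for finetuning."""
    return [{"instruction": i, "api_call": record["api_call"]} for i in instructions]

# Stand-in instructions; in the actual pipeline these come from GPT-4.
pairs = to_training_pairs(api_record, [
    "Classify this photo of a dog.",
    "Identify the main object in an image.",
])
```

Each resulting pair maps a natural-language instruction to the correct API call, which is the supervision signal used during finetuning.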
