snorkelflow.client.fm_suite.prompt_fm_over_dataset
- snorkelflow.client.fm_suite.prompt_fm_over_dataset(prompt_template, dataset, x_uids, model_name, model_type=None, runs_per_prompt=1, sync=True, cache_name='default', system_prompt=None, **fm_hyperparameters)
Run a prompt over a dataset. Any field in the dataset can be referenced in the prompt by using curly braces, {field_name}.
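To illustrate the template mechanics, here is a minimal pure-Python sketch (not the SnorkelFlow implementation) of how a `{field_name}` placeholder is filled from each dataset row before the prompt is sent to the foundation model; the field names and rows are hypothetical:

```python
# Hypothetical template and rows, mirroring the Examples section below.
prompt_template = "{email_subject}. What is this email about?"

rows = [
    {"email_subject": "Fill in survey for $50 amazon voucher"},
    {"email_subject": "Hey it's Bob, free on Sat?"},
]

# Each row's fields are substituted into the template by name,
# producing one prompt per row.
prompts = [prompt_template.format(**row) for row in rows]
print(prompts[0])
# Fill in survey for $50 amazon voucher. What is this email about?
```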
Parameters

prompt_template : str
    The prompt template used to format input rows sent to the foundation model.
dataset : Union[str, int]
    The name or UID of the dataset containing the data we want to prompt over.
x_uids : List[str]
    The x_uids of the rows within the dataset to prompt over.
model_name : str
    The name of the foundation model to use.
model_type : Optional[LLMType], default None
    The way the foundation model should be used; must be one of the LLMType values.
runs_per_prompt : int, default 1
    The number of times to run inference over an x_uid. Note that each response can differ; all responses are cached.
sync : bool, default True
    Whether to wait for the job to complete before returning the result.
cache_name : str, default 'default'
    The cache name is used in the hash construction. To rerun a prompt and get a different result, change the cache name to one that has not been used before. For example:

    >>> sf.prompt_fm("What is the meaning of life?", "openai/gpt-4o")
    The meaning of life is to work…
    >>> sf.prompt_fm("What is the meaning of life?", "openai/gpt-4o")  # hits the cache
    The meaning of life is to work…
    >>> sf.prompt_fm("What is the meaning of life?", "openai/gpt-4o", cache_name="run_2")  # hits a different part of the cache
    The meaning of life is to have fun!
system_prompt : Optional[str], default None
    The system prompt to prepend to the prompt.
fm_hyperparameters : Any
    Additional keyword arguments to pass to the foundation model, such as temperature, max_tokens, etc.

Return type
Union[DataFrame, str]

Returns
df – DataFrame containing the predictions for the data points. There are two columns: the input prompt and the output of the foundation model.
job_id – The job ID of the prompt inference job, which can be used to monitor progress with sf.poll_job_status(job_id).

Examples
>>> sf.prompt_fm_over_dataset(prompt_template="{email_subject}. What is this email about?", dataset=1, x_uids=["0", "1"], model_name="openai/gpt-4")
  | email_subject                         | generated_text                                                      | perplexity
--|---------------------------------------|---------------------------------------------------------------------|-----------
0 | Fill in survey for $50 amazon voucher | The email is asking you to fill in a survey for an amazon voucher   | 0.891
1 | Hey it's Bob, free on Sat?            | The email is from your friend Bob asking if you're free on Saturday | 0.787
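The cache_name semantics above can be sketched in plain Python (a hedged, illustrative model, not the real SnorkelFlow cache): results are keyed by prompt, model, and cache name, so repeating a call returns the cached response, while a new cache_name forces a fresh model call. The `fake_fm` helper below is a hypothetical stand-in for the foundation model.

```python
cache = {}
calls = []

def fake_fm(prompt):
    # Hypothetical stand-in for a foundation-model call; each real
    # call is recorded so we can see when the cache was bypassed.
    calls.append(prompt)
    return f"response #{len(calls)}"

def prompt_fm(prompt, model_name, cache_name="default"):
    # Results are keyed by (prompt, model, cache_name): same key -> cached
    # response, new cache_name -> fresh call.
    key = (prompt, model_name, cache_name)
    if key not in cache:
        cache[key] = fake_fm(prompt)
    return cache[key]

a = prompt_fm("What is the meaning of life?", "openai/gpt-4o")
b = prompt_fm("What is the meaning of life?", "openai/gpt-4o")  # cache hit
c = prompt_fm("What is the meaning of life?", "openai/gpt-4o",
              cache_name="run_2")  # different cache entry, fresh call
print(a == b, len(calls))
# True 2
```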