Introduction to Self-Consistency in LLMs
Self-consistency is an advanced prompting technique that builds on chain-of-thought (CoT) prompting. The aim is to replace the naive greedy decoding used in CoT prompting by sampling multiple diverse reasoning paths and selecting the most consistent answer.
This can boost the performance of CoT prompting on tasks involving arithmetic and commonsense reasoning. By taking a majority vote over the sampled answers, the AI model can arrive at more accurate and reliable answers.
In practice, self-consistency means having the AI model produce multiple reasoning paths for the same problem and then selecting the answer on which the generated responses agree most often. This technique not only helps to reduce biases in the AI's responses but also encourages it to consider various viewpoints before arriving at a conclusion.
Understanding Self-Consistency
In real-life scenarios, when faced with a problem, humans often explore different reasoning paths or consult multiple sources to arrive at a well-informed decision. Similarly, self-consistency aims to simulate this process by providing the model with diverse perspectives and encouraging it to critically evaluate its own reasoning. By doing so, it increases the likelihood of generating an accurate and unbiased response.
How Self-Consistency Works
To implement self-consistency, prompt engineers typically follow these steps:
- Identify the problem: Define the problem or question for which you require the LLM's assistance. Make sure it is clear and specific.
- Create multiple prompts: Develop various prompts that approach the problem from different angles or perspectives. Each prompt should provide a unique reasoning path for the AI to follow.
- Generate responses: Submit the prompts to the LLM and collect the responses generated by the model.
- Evaluate consistency: Analyze the generated responses to determine their coherence, relevance, and consistency. This step may involve comparing the responses to each other, looking for common themes or patterns, and checking for internal logical consistency.
- Select the best response: Based on the evaluation, choose the most consistent and accurate response as the final answer.
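The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `sample_fn` is a hypothetical callable standing in for whatever LLM client you use (assumed to sample with some randomness, so repeated calls can follow different reasoning paths), `fake_sample` is a canned stand-in so the sketch runs without a model, and the pattern in `extract_answer` assumes completions end with "The answer is N."

```python
import itertools
import re
from collections import Counter
from typing import Callable, Optional

# Assumption: each completion ends with a sentence like "The answer is 67."
ANSWER_PATTERN = re.compile(r"The answer is\s*\$?(-?\d+(?:\.\d+)?)")


def extract_answer(completion: str) -> Optional[str]:
    """Return the final numeric answer as a string, or None if absent."""
    match = ANSWER_PATTERN.search(completion)
    return match.group(1) if match else None


def self_consistency(
    prompt: str,
    sample_fn: Callable[[str], str],  # hypothetical: prompt in, one completion out
    n_samples: int = 5,
) -> Optional[str]:
    """Sample several completions and return the most frequent answer."""
    answers = []
    for _ in range(n_samples):
        answer = extract_answer(sample_fn(prompt))
        if answer is not None:  # skip completions with no parsable answer
            answers.append(answer)
    if not answers:
        return None
    # Majority vote over the extracted answers.
    return Counter(answers).most_common(1)[0][0]


# Canned stand-in for a real model, used only to make the sketch runnable.
_canned = itertools.cycle([
    "She was 3 then, so she is 70 - 3 = 67. The answer is 67.",
    "Half of 70 is 35. The answer is 35.",
    "The age gap is 3 years, so 70 - 3 = 67. The answer is 67.",
])


def fake_sample(prompt: str) -> str:
    return next(_canned)


question = "When I was 6 my sister was half my age. Now I'm 70 how old is my sister?"
print(self_consistency(question, fake_sample, n_samples=3))  # prints 67
```

Voting on the extracted final answer rather than on the full text is a deliberate choice here: two reasoning paths rarely match word for word, but they can still agree on the result.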
Benefits of Self-Consistency
Implementing self-consistency offers several benefits:
- Improved accuracy: By exploring multiple reasoning paths, the LLM is more likely to arrive at a correct or well-informed response.
- Reduced bias: Considering diverse perspectives helps counteract biases that may be inherent in the AI model or the data it has been trained on.
- Enhanced critical thinking: Encouraging the AI to evaluate its own reasoning promotes critical thinking, which is an essential skill for problem-solving and decision-making.
Example of Self-Consistency Prompting
Prompt:
When I was 6 my sister was half my age. Now I’m 70 how old is my sister?
Output:
73
The result is incorrect. Can self-consistency improve on it? Let's give it a shot, using a selection of the few-shot exemplars from Wang et al. 2022 (Table 17).
In the following prompt, we provide multiple examples of similar problems, each worked through step by step to the correct answer.
Prompt:
Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?
A: We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted. So, they must have planted 21 - 15 = 6 trees. The answer is 6.
Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A: There are 3 cars in the parking lot already. 2 more arrive. Now there are 3 + 2 = 5 cars. The answer is 5.
Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?
A: Leah had 32 chocolates and Leah’s sister had 42. That means there were originally 32 + 42 = 74 chocolates. 35 have been eaten. So in total they still have 74 - 35 = 39 chocolates. The answer is 39.
Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
A: Jason had 20 lollipops. Since he only has 12 now, he must have given the rest to Denny. The number of lollipops he has given to Denny must have been 20 - 12 = 8 lollipops. The answer is 8.
Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?
A: He has 5 toys. He got 2 from mom, so after that he has 5 + 2 = 7 toys. Then he got 2 more from dad, so in total he has 7 + 2 = 9 toys. The answer is 9.
Q: There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?
A: There are 4 days from monday to thursday. 5 computers were added each day. That means in total 4 * 5 = 20 computers were added. There were 9 computers in the beginning, so now there are 9 + 20 = 29 computers. The answer is 29.
Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?
A: Michael initially had 58 balls. He lost 23 on Tuesday, so after that he has 58 - 23 = 35 balls. On Wednesday he lost 2 more so now he has 35 - 2 = 33 balls. The answer is 33.
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
A: She bought 5 bagels for $3 each. This means she spent 5 * $3 = $15. She had $23, so she has $23 - $15 = $8 left. The answer is 8.
Q: When I was 6 my sister was half my age. Now I’m 70 how old is my sister?
A:
Output 1:
When I was 6 my sister was half my age, so she was 3. Now I am 70, so she is 70 - 3 = 67. The answer is 67.
Output 2:
When the narrator was 6, his sister was half his age, which is 3. Now that the narrator is 70, his sister would be 70 - 3 = 67 years old. The answer is 67.
Output 3:
When I was 6 my sister was half my age, so she was 3. Now I am 70, so she is 70/2 = 35. The answer is 35.
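Two of the three outputs agree on 67, so the majority vote picks 67 as the final answer. As a minimal sketch of that tally, the snippet below runs the vote over the three outputs shown above. The helper name `majority_answer` and the regular expression are illustrative assumptions; the pattern relies on every output ending with "The answer is N."

```python
import re
from collections import Counter

# The three sampled outputs from the example above, copied verbatim.
outputs = [
    "When I was 6 my sister was half my age, so she was 3. "
    "Now I am 70, so she is 70 - 3 = 67. The answer is 67.",
    "When the narrator was 6, his sister was half his age, which is 3. "
    "Now that the narrator is 70, his sister would be 70 - 3 = 67 years old. "
    "The answer is 67.",
    "When I was 6 my sister was half my age, so she was 3. "
    "Now I am 70, so she is 70/2 = 35. The answer is 35.",
]


def majority_answer(completions):
    """Return (winning answer, its vote count, full tally)."""
    # Keep only the final number each completion commits to.
    answers = [
        re.search(r"The answer is (\d+)", text).group(1)
        for text in completions
    ]
    tally = Counter(answers)
    winner, votes = tally.most_common(1)[0]
    return winner, votes, tally


winner, votes, tally = majority_answer(outputs)
print(f"{winner} wins with {votes} of {len(outputs)} votes")
# prints: 67 wins with 2 of 3 votes
```

With only three samples a single stray path can still swing a close vote, so drawing more samples is the usual way to make the tally more stable.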
Using Self-consistency in Prompt Recipes & Templates
Incorporating this methodology into your prompt recipes is straightforward. Simply include an "example" element within the instruction portion of your prompt. In this section, you can provide clear and explicit examples to guide the model. For newer and more advanced LLMs, you may only need 1-3 examples to achieve desired results. However, it is essential to test and optimize your prompts for your specific use case to ensure optimal performance.
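As a rough illustration of such an "example" element, the sketch below assembles a few-shot prompt in the same Q:/A: layout used in the prompt above. The function name `build_few_shot_prompt` and the choice of two examples are assumptions made for this sketch, not part of any particular template system.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Join worked (question, answer) pairs, then append the new question."""
    lines = []
    for example_question, worked_answer in examples:
        lines.append(f"Q: {example_question}")
        lines.append(f"A: {worked_answer}")
    # Leave the final answer slot open for the model to complete.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


examples = [
    (
        "If there are 3 cars in the parking lot and 2 more cars arrive, "
        "how many cars are in the parking lot?",
        "There are 3 cars in the parking lot already. 2 more arrive. "
        "Now there are 3 + 2 = 5 cars. The answer is 5.",
    ),
    (
        "Jason had 20 lollipops. He gave Denny some lollipops. "
        "Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?",
        "Jason had 20 lollipops. Since he only has 12 now, he must have given "
        "the rest to Denny. The number of lollipops he has given to Denny must "
        "have been 20 - 12 = 8 lollipops. The answer is 8.",
    ),
]

prompt = build_few_shot_prompt(
    examples,
    "When I was 6 my sister was half my age. Now I'm 70 how old is my sister?",
)
print(prompt)
```

Keeping the examples in a list makes it easy to try different numbers of them when testing the prompt for a specific use case.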
Summary
In summary, self-consistency is a powerful technique that can significantly enhance the performance of an LLM. Incorporating multiple reasoning paths and diverse perspectives improves the accuracy and reliability of the AI's responses, making it a valuable tool for various applications.