MCQ Generation using Langchain¶
Let's try a few methods and see which serves our purpose better...
Trial and Error with Different Methods¶
Necessary Imports¶
from warnings import filterwarnings
filterwarnings('ignore')
from langchain import OpenAI, ConversationChain, PromptTemplate
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI
from openai.error import RateLimitError
from pprint import pprint
import joblib
import time
import os
from scripts import *
Defining API key¶
os.environ["OPENAI_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
Reading the PDF and Extracting context¶
# This get_context function extracts context from one specific page only, as we can't expect an LLM to process too much text with a free-tier API...
# We can change the function as needed if we have a paid API key...
try:
    path = r'Dataset/chapter-2.pdf'
    context = get_context(path, get_page_num = 4)
    print(context)
except Exception as e:
    print(f'An error occurred: {e}')
12 OUR PASTS – IIIHow trade led to battles Through the early eighteenth century, the conflict between the Company and the nawabs of Bengal intensified. After the death of Aurangzeb, the Bengal nawabs asserted their power and autonomy, as other regional powers were doing at that time. Murshid Quli Khan was followed by Alivardi Khan and then Sirajuddaulah as the Nawab of Bengal. Each one of them was a strong ruler. They refused to grant the Company concessions, demanded large tributes for the Company’s right to trade, denied it any right to mint coins, and stopped it from extending its fortifications. Accusing the Company of deceit, they claimed that the Company was depriving the Bengal government of huge amounts of revenue and undermining the authority of the nawab. It was refusing to pay taxes, writing disrespectful letters, and trying to humiliate the nawab and his officials. The Company on its part declared that the unjust demands of the local officials were ruining the trade of the Company, and trade could flourish only if the duties were removed. It was also convinced that to expand trade, it had to enlarge its settlements, buy up villages, and rebuild its forts. The conflicts led to confrontations and finally culminated in the famous Battle of Plassey. The Battle of Plassey When Alivardi Khan died in 1756, Sirajuddaulah became the nawab of Bengal. The Company was worried about his power and keen on a puppet ruler who would willingly give trade concessions and other privileges. So it tried, though without success, to help one of Sirajuddaulah’s rivals become the nawab. An infuriated Sirajuddaulah asked the Company to stop meddling in the political affairs of his dominion, stop fortification, and pay the revenues. After negotiations failed, the Nawab marched with 30,000 soldiers to the English factory at Kassimbazar, captured the Company officials, locked the warehouse, disarmed all Englishmen, and blockaded English ships. Then he marched to Calcutta to establish control over the Company’s fort there. On hearing the news of the fall of Calcutta, Company officials in Madras sent forces under the command of Robert Clive, reinforced by naval fleets. Prolonged negotiations with the Nawab followed. Finally, in 1757, Robert Clive led the Company’s army against Sirajuddaulah at Plassey. One of the main reasons for Did you know? Did you know how Plassey got its name? Plassey is an anglicised pronunciation of Palashi and the place derived its name from the palash tree known for its beautiful red flowers that yield gulal , the powder used in the festival of Holi.Fig. 4 – Robert Clive Puppet – Literally, a toy that you can move with strings. The term is used disapprovingly to refer to a person who is controlled by someone else. chap 1-4.indd 12 4/22/2022 2:49:28 PMRationalised 2023-24
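For reference, here is a minimal sketch of what the get_context helper (imported from scripts) might look like, assuming it uses PyPDF2 to pull the raw text of a single page; the actual implementation in scripts.py may differ.

# Hypothetical sketch of the get_context helper (assumption: PyPDF2-based, not the actual scripts.py code)
from PyPDF2 import PdfReader

def get_context_sketch(path: str, get_page_num: int = 0) -> str:
    """Extract raw text from a single page of the PDF at `path`."""
    reader = PdfReader(path)
    page = reader.pages[get_page_num]
    return page.extract_text()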
# Let's define some basic variables that we are going to use throughout this notebook
num_questions = 3
total_options = 4
correct_options = 2
Method 1 - Preprocessing the text and using the model to generate questions based on a question template¶
# I have defined a function to preprocess the context we just extracted, and that function is called inside the generate_mcqs function
try:
    mcq_questions = generate_mcqs(context)
    pprint(mcq_questions)
except RateLimitError as e:
    print("Error: Rate limit exceeded. Default rate limit is 3 calls every 20 seconds for free tier... Use paid API key")
except Exception as e:
    print("Error:", e)
Error: Rate limit exceeded. Default rate limit is 3 calls every 20 seconds for free tier... Use paid API key
Rationale behind not choosing method 1¶
As we can see, the function uses multiple loops to generate multiple questions and multiple options. Under OpenAI's rate-limit policy, that means far more frequent calls than a free-tier API key allows, so this approach isn't useful unless we have a paid API key.
Even then, we would probably need to change the function to extract the MCQs in a structured format, since llm.generate() returns the MCQs along with other parameters in a clumsy structure...
So we won't be using this method. A rough sketch of the call pattern it follows is given below.
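To make the rate-limit problem concrete, here is a rough, hypothetical sketch of the call pattern such a generate_mcqs function might follow (the real function lives in scripts.py and may differ): one LLM call per question plus one per option quickly exceeds 3 calls per 20 seconds.

# Hypothetical sketch of a loop-heavy generate_mcqs (assumption; not the actual scripts.py implementation)
def generate_mcqs_sketch(context, num_questions = 3, total_options = 4):
    llm = OpenAI(temperature = 0)
    mcqs = []
    for q in range(num_questions):
        # one call to draft the question itself
        question = llm(f"From this context, write exam question #{q + 1}:\n{context}")
        options = []
        for o in range(total_options):
            # one more call per option -> num_questions * (1 + total_options) calls in total
            options.append(llm(f"Write answer option {o + 1} for this question: {question}"))
        mcqs.append({"question": question, "options": options})
    return mcqs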
prompt_temp = """I will provide a context and will mention number of questions to generate and you would behave as a strict MCQ generator(stick to context and rules that I specify in this prompt strictly) with as many correct options as i specify and remaining options out of total options I mention should be wrong. It's mandatory that atleast 2 of the total number of options are correct answers to the question...No question should have just one correct option and all options can't be wrong. The questions should not just test the comprehension of the candidate rather should also test his/her reasoning ability... Options as well should be framed in such a way... Any specific question and corresponding options should be given out as a python string and all questions and options should be enclosed in a python list...
Avoid any additional warnings, apologies or any such statements from your side... The template of your response should be as simple as I have mentioned below as 'Your Response'
context: {context}
num_questions: {num_questions}
Your Response: Questions: [
Q1:
A.)
B.)
C.)
D.)
Q2:
.
.
.
.
]
"""
prompt = PromptTemplate(input_variables=['context', 'num_questions'], template = prompt_temp)
# While re-executing the entire notebook after the previous call that hit the rate limit, I encountered a rate-limit error in this cell as well.
# That's why I added a check to re-execute the code in case this cell also hits the rate limit...
max_retries = 2
retry_count = 0
while retry_count < max_retries:
    try:
        PROMPT = prompt.format(context = context, num_questions = num_questions)
        llm = OpenAI(temperature = 0, max_tokens = 512, max_retries = 1)
        solution = llm(PROMPT)
        print(solution)
        break
    except RateLimitError as e:
        retry_count += 1
        print("RateLimitError encountered. Retrying after 18 seconds...")
        time.sleep(18)
Q1: What was the main reason for the Battle of Plassey? A.) To establish control over the Company's fort in Calcutta B.) To stop fortification and pay the revenues C.) To help one of Sirajuddaulah's rivals become the nawab D.) To give trade concessions and other privileges Q2: What does the term 'puppet' mean? A.) A toy that you can move with strings B.) A person who is controlled by someone else C.) A beautiful red flower D.) A powder used in the festival of Holi Q3: What is the anglicised pronunciation of Palashi? A.) Palashi B.) Plassey C.) Gulal D.) Holi
Rationale behind not choosing method 2¶
As we can see, this looks far better than what we were getting previously... But we would still need to fine-tune by repeatedly correcting the model's mistakes; here, for example, it gave only one correct option per question despite our strict prompting. That kind of correction is only possible with a conversation chain and a chat model...
Also, this method uses text-davinci-002 by default, and we can't use the more advanced gpt-3.5-turbo here as it is a conversational model... If we try to use gpt-3.5-turbo with this interface, the program will throw an error...
So we won't go with this method either...
llm = ChatOpenAI(temperature = 0) # By default, ChatOpenAI uses gpt-3.5-turbo as the model
convo = ConversationChain(llm = llm, memory = ConversationBufferMemory())
convo.memory.chat_memory.dict()
{'messages': []}
convo.predict(input = """
First things first, assume you are responding to a non-living thing and there's no need for any sentiments towards it like apologies, warnings, disclaimers and so on, as it won't understand what you are saying... So, that's it, you signed an agreement with me not to apologise or warn or provide unnecessary additional statements... If you feel like saying something apart from what the non-living thing asks you to do, just leave a single space and move on rather than speaking unnecessarily. It will just give you instructions if you err, and you should keep those in mind, correct your course and generate the template accordingly without apologising or framing unnecessary additional statements going away from the template you are asked to generate...
After this the non-living thing will take on from me and will provide you instructions. Strictly follow those.
"""
)
'Understood. I will follow the instructions provided by the non-living thing without any unnecessary additional statements or apologies. Please proceed with the instructions.'
instruction = """
I will provide a context and will mention the number of questions to generate, and you should behave as a strict MCQ generator (stick to the context and rules that I specify in this prompt strictly) with as many correct options as I specify, and the remaining options out of the total options I mention should be wrong. It's mandatory that at least 2 of the total number of options are correct answers to the question... No question should have just one correct option and all options can't be wrong. The questions should not just test the comprehension of the candidate but should also test his/her reasoning ability... Options as well should be framed in such a way... Any specific question and its corresponding options should be given out as a python string and all questions and options should be enclosed in a python list...
Having no correct option or just one correct option is never allowed. This is super mandatory to keep in mind.
If you can't frame a question with multiple correct options, skip it and frame some other question rather than going outside the framework and framing a question with just one or no correct option.
The template of your response should be as simple as what I have mentioned below as 'Your Response'.
First let's train with a few contexts, and once I say 'You are good to serve the purpose', you should just stick to the template whenever I give some context and should avoid any additional disclaimers or apologies or any such additional statements from your side apart from the template, as I don't have any emotions just like you and I don't need anything apart from MCQs based on the template from you....
Parameters from me:
context: {context}
num_questions: {num_questions}
total_options: {total_options}
correct_options: {correct_options}
Template that you should follow: [
\"Q1:
A.)
B.)
C.)
D.)\",
\"Q2:
.
.
.
.\",
]
"""
convo.predict(input = instruction)
'Understood. I will generate multiple-choice questions based on the provided context and follow the specified rules and template. Please provide the context, number of questions, total options, and correct options for each question.'
prompt_1 = f"""
context: {context}
num_questions: {num_questions}
total_options: {total_options}
correct_options: {correct_options}
"""
output_1 = convo.predict(input = prompt_1)
print(output_1)
[ "Q1: Who were the nawabs of Bengal after the death of Aurangzeb? A) Murshid Quli Khan B) Alivardi Khan C) Sirajuddaulah D) Robert Clive", "Q2: Why did the conflicts between the Company and the nawabs of Bengal intensify? A) The nawabs refused to grant the Company concessions B) The Company refused to pay taxes C) The Company demanded large tributes for the right to trade D) The nawabs accused the Company of deceit", "Q3: What was the outcome of the Battle of Plassey? A) The nawabs of Bengal emerged victorious B) The Company's fort in Calcutta was destroyed C) Sirajuddaulah became the nawab of Bengal D) Robert Clive led the Company's army against Sirajuddaulah" ]
convo.predict(input = "Now this itself looks pretty cool and to the point... You seem to have followed the instructions duely... Keep it up and follow same way of generating questions and options with same template for any future contexts...")
'Thank you! I will continue to generate questions and options using the specified template for future contexts.'
Now this itself looks cool... If we are not content with the output, we can continue the conversation until we are satisfied, stop it there, save the conversation memory as a pickle file as shown below, and reuse that same memory whenever we want to generate MCA questions...
Dumping the conversation as pickle file¶
joblib.dump(convo.memory, 'convo.pkl')
['convo.pkl']
Now that we have saved the memory as a pickle file at the point where the model behaved well, the model is more likely to behave in a similar way whenever we call the function that loads the pickle file and continues the conversation from this same point for any future context, giving the desired output in the specified structure...
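For reference, the get_mca_questions helper used below (imported from scripts) could look roughly like this; a sketch assuming it reloads convo.pkl, rebuilds the ConversationChain, and sends the new context with the same parameters. The actual implementation in scripts.py may differ.

# Hypothetical sketch of get_mca_questions (assumption: the exact prompt wording in scripts.py may differ)
def get_mca_questions_sketch(context, num_questions = 3, total_options = 4, correct_options = 2):
    memory = joblib.load('convo.pkl')  # restore the saved, well-behaved conversation memory
    chain = ConversationChain(llm = ChatOpenAI(temperature = 0), memory = memory)
    prompt = (f"context: {context}\n"
              f"num_questions: {num_questions}\n"
              f"total_options: {total_options}\n"
              f"correct_options: {correct_options}\n")
    return chain.predict(input = prompt)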
Loading new context from different page¶
context_2 = get_context(path, get_page_num = 7)
context_2
'FROM TRADE TO TERRITORY 15After the Battle of Plassey, the actual nawabs of \nBengal were forced to give land and vast sums of \nmoney as personal gifts to Company officials. Robert Clive himself amassed a fortune in India. He had come to Madras (now Chennai) from England in 1743 at the age of 18. When in 1767 he left India, his Indian fortune was worth £401,102. Interestingly, when he was appointed Governor of Bengal in 1764, he was asked to remove corruption in Company administration but he was himself cross-examined in 1772 by the British Parliament which was suspicious of his vast wealth. Although he was acquitted, he committed suicide \nin 1774. \nHowever, not all Company officials succeeded in \nmaking money like Clive. Many died an early death in India due to disease and war, and it would not be right to regard all of them as corrupt and dishonest. Many of them came from humble backgrounds and their uppermost desire was to earn enough in India, return to Britain and lead a comfortable life. Those who managed to return with wealth led flashy lives and flaunted their riches. They were called \n“nabobs” – an anglicised version of the Indian word nawab. They were often seen as upstarts and social climbers in British society and were ridiculed or made fun of in plays and cartoons. \nCompany Rule Expands\nIf we analyse the process of annexation of Indian states by the East India Company from 1757 to 1857, certain key aspects emerge. The Company rarely launched a direct military attack on an unknown territory. Instead it used a variety of political, economic and diplomatic methods to extend its influence before annexing an Indian kingdom. \nAfter the Battle of Buxar (1764), the Company \nappointed Residents in Indian states. They were political or commercial agents and their job was to serve and further the interests of the Company. Through the Residents, the Company officials began interfering in the internal affairs of Indian states. They tried to decide who was to be the successor to the throne, and who was to be appointed in administrative posts. Sometimes, the Company forced the states into a “subsidiary alliance”. According to the terms of this alliance, Indian rulers were not allowed to have their independent armed forces. They were to be protected by the Company, but \nHow did Clive \nsee himself?\nAt his hearing in front of a Committee in Parliament, Clive declared that he had shown admirable restraint after the Battle of Plassey. This is what he said:\nConsider the situation in which the victory at Plassey had placed me! A great prince was dependent on my pleasure; an opulent city lay at my mercy; its richest bankers bid against each other for my smiles; I walked through vaults which were thrown open to me alone, piled on either hand with gold and jewels! Mr Chairman, at this moment I stand astonished at my moderation.Source 3\nImagine that you are a young Company official who has been in India for a few months. Write a letter home to your mother telling her about your luxurious life and contrasting it with your earlier life in Britain. Activity\uf086\nchap 1-4.indd 15 4/22/2022 2:49:30 PMRationalised 2023-24\n'
Using new context as input for the ultimate function¶
print(get_mca_questions(context_2))
[ "Q1: How did Robert Clive amass his fortune in India? A) By receiving personal gifts from the nawabs of Bengal B) By removing corruption in Company administration C) By leading the Company's army in battles D) By engaging in trade with Indian states", "Q2: What were Company officials called who returned to Britain with wealth? A) Nabobs B) Residents C) Diplomats D) Soldiers", "Q3: How did the East India Company extend its influence before annexing an Indian kingdom? A) By launching direct military attacks B) By appointing Residents in Indian states C) By forming subsidiary alliances with Indian rulers D) By engaging in diplomatic negotiations" ]
As we can observe, it behaves with the same tone and structure as where we left the previous conversation, since we use the same memory buffer for all future function calls... This seems to be pretty much the optimal solution for our use case...
Advantages:¶
The model will never lose its memory of what the structure should be and how the answers should be generated, as the conversation is not too deep and we preserve the memory at a good point, starting every function call from that same point...
Even a straightforward conversation with an LLM like ChatGPT, or raw API calls, won't give us this kind of cushion...
In my opinion, it is comparatively better than any other possible solution...
It tests the reasoning ability of the candidate rather than mere comprehension...
Pitfalls:¶
The model might not always be obedient and might behave differently when the context is too small or too large... The solution would be to use a better model with a paid API, such as GPT-4, which is the best in the market right now and understands context and obeys prompts better...
Rate limits and the free-tier limit would be an issue... We should either go for a paid API, or use some HuggingFaceHub model, which might not be the best in the market but might do some justice to the cause (see the sketch below)...
It might occasionally give out just one correct option if the context is too small to pick more right options from...
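As an illustration of the HuggingFaceHub fallback mentioned above, a Hub-hosted model can be dropped into the same chain with a few lines. This is only a sketch: it assumes LangChain's HuggingFaceHub wrapper, a valid HUGGINGFACEHUB_API_TOKEN, and uses google/flan-t5-xl purely as an example repo id.

# Sketch of swapping in a Hugging Face Hub model (assumptions: HuggingFaceHub wrapper available,
# HUGGINGFACEHUB_API_TOKEN set, and "google/flan-t5-xl" is just an example repo id)
from langchain import HuggingFaceHub

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

hf_llm = HuggingFaceHub(repo_id = "google/flan-t5-xl",
                        model_kwargs = {"temperature": 0.5, "max_length": 512})
convo_hf = ConversationChain(llm = hf_llm, memory = joblib.load('convo.pkl'))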
Possible Ways to Improve¶
- We can use the FSL (Few-Shot Learning) technique to fine-tune the model's responses by providing 3-4 sample contexts and sample MCA questions, as sketched below...
For FSL, I wrote an article on Medium... You can refer to it here for more context...
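To give a flavour of the FSL idea (not taken from the article), here is a minimal sketch using LangChain's FewShotPromptTemplate; the example contexts and MCQs below are placeholders that would be replaced with real hand-written samples.

# Minimal few-shot sketch (assumption: the example texts are placeholders, not real curated samples)
from langchain import FewShotPromptTemplate, PromptTemplate

example_prompt = PromptTemplate(
    input_variables = ["context", "mcqs"],
    template = "context: {context}\nMCQs: {mcqs}",
)

examples = [
    {"context": "<sample passage 1>", "mcqs": '["Q1: ... A.) ... B.) ... C.) ... D.) ..."]'},
    {"context": "<sample passage 2>", "mcqs": '["Q1: ... A.) ... B.) ... C.) ... D.) ..."]'},
]

few_shot_prompt = FewShotPromptTemplate(
    examples = examples,
    example_prompt = example_prompt,
    prefix = "Generate MCQs with at least 2 correct options per question, following the examples:",
    suffix = "context: {context}\nMCQs:",
    input_variables = ["context"],
)

print(OpenAI(temperature = 0)(few_shot_prompt.format(context = context_2)))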