Learn to use Google's Gemini-1.5-pro-latest model to develop a generative AI app for calorie counting
Have you ever wondered how many calories you consume when you eat your dinner, for example? I do that all the time. Wouldn't it be wonderful if you could simply pass a picture of your plate through an app and get an estimate of the total number of calories before you decide how far you want to dig in?
The calorie counter app that I created can help you achieve this. It's a Python application that uses Google's Gemini-1.5-pro-latest model to estimate the number of calories in food items.
The app takes two inputs: a question about the food and an image of the food items or, simply, a plate of food. It outputs an answer to the question, the total number of calories in the image, and a breakdown of calories for each food item in the image.
In this article, I'll explain the entire end-to-end process of building the app from scratch using Google's Gemini-1.5-pro-latest (a large language generative AI model released by Google), and how I developed the front end of the application using Streamlit.
It's worth noting that, with advances in the world of AI, data scientists increasingly need to shift from traditional deep learning toward generative AI techniques in order to stay effective in their roles. That is my main reason for writing on this subject.
Let me start by briefly explaining Gemini-1.5-pro-latest and the Streamlit framework, as they are the major components in the infrastructure of this calorie counter app.
Gemini-1.5-pro-latest is an advanced AI language model developed by Google. As the latest version at the time of writing, it offers faster response times and improved accuracy over earlier versions when used for natural language processing and for building applications.
It is a multi-modal model that works with both text and images, an advancement over the Google Gemini-pro model, which only works with text prompts.
The model works by understanding and generating text, like humans do, based on the prompts given to it. In this article, the model will be used to generate text for our calorie counter app.
Gemini-1.5-pro-latest can be integrated into other applications to strengthen their AI capabilities. In this application, the model is asked to break the uploaded image into individual food items. Drawing on what it has learned about food and nutrition, it recognizes each item, estimates its calories, and then totals the calories for all items in the image.
Streamlit is an open-source Python framework that will manage the user interface. It simplifies web development so that, throughout the project, you don't need to write any HTML or CSS for the front end.
Let us dive into building the app.
I'll show you how to build the app in 5 clear steps.
1. Set up your folder structure
To start, open your favorite code editor (mine is VS Code) and create a project folder. Call it Calories-Counter, for example. This is the current working directory. Create a virtual environment (venv), activate it in your terminal, and then create the following files: .env, calories.py, requirements.txt.
Here's a suggestion for the look of your folder structure:
Calories-Counter/
├── venv/
│   ├── xxx
│   ├── xxx
├── .env
├── calories.py
└── requirements.txt
Please note that Gemini-1.5-Pro works best with Python versions 3.9 and greater.
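If you want to confirm the interpreter inside your virtual environment before going further, a minimal sketch like the one below can help. The helper name `meets_minimum` is hypothetical, and the 3.9 threshold is simply taken from the note above.

```python
import sys

# Minimum version taken from the note above; adjust if your setup differs.
MINIMUM_PYTHON = (3, 9)


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum(sys.version_info):
        print("Python version looks fine.")
    else:
        print("Please use Python 3.9 or greater for this project.")
```

Running this with the interpreter from your activated venv tells you whether that environment matches the recommendation.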
2. Get the Google API key
Like other Gemini models, Gemini-1.5-pro-latest is currently free for public use. Accessing it requires an API key, which you can get from Google AI Studio by clicking "Get API key". Once the key is generated, copy it for later use in your code. Save the key as an environment variable in the .env file as follows.
GOOGLE_API_KEY="paste the generated key here"
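A missing or placeholder key is an easy mistake to make at this step, so it can be worth failing early with a clear message. The sketch below is one way to do that; `require_api_key` is a hypothetical helper that is not part of the original app, and it assumes the .env file has already been loaded into the environment.

```python
import os

# The placeholder text used in the .env example above.
PLACEHOLDER = "paste the generated key here"


def require_api_key(env=os.environ, name="GOOGLE_API_KEY"):
    """Return the API key from `env`, or raise a clear error if it is missing."""
    key = (env.get(name) or "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"{name} is not set; add it to your .env file.")
    return key
```

Passing `env` as a parameter keeps the check easy to try with a plain dictionary, without touching your real environment.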
3. Install dependencies
Type the following libraries into your requirements.txt file.

streamlit
google-generativeai
python-dotenv

In the terminal, install the libraries in requirements.txt with:

python -m pip install -r requirements.txt

4. Write the Python script
Now, let's start writing the Python script in calories.py. With the following code, import all required libraries:

# import the libraries
from dotenv import load_dotenv
import streamlit as st
import os
import google.generativeai as genai
from PIL import Image

Here's how the various imported modules will be used:
- dotenv: since the application is configured from a Google API key stored as an environment variable, dotenv is used to load the configuration from the .env file.
- streamlit: creates the interactive user interface for the front end.
- os: reads the API key from the environment (via os.getenv) once the .env file has been loaded.
- google.generativeai: gives us access to the Gemini model we are about to use.
- PIL: the Python imaging library, used to open the uploaded image so that it can be displayed.
The following lines load the environment variables from the .env file and then configure the API key. Note that load_dotenv() must run first; otherwise os.getenv returns None.

load_dotenv()
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

Define a function that, when called, will load the Gemini-1.5-pro-latest model and get the response, as follows:

def get_gemini_reponse(input_prompt, image, user_prompt):
    # Load the model and send the prompt, the image and the user's question together
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    response = model.generate_content([input_prompt, image[0], user_prompt])
    return response.text

The function takes three inputs: the input prompt that will be specified further down in the script, an image supplied by the user, and a prompt/question supplied by the user. All of these go into the Gemini model, which returns the response text.
Since the image is passed to Gemini-1.5-pro here as raw bytes together with its MIME type, the next step is to write a function that processes the uploaded image and converts it to bytes.

def input_image_setup(uploaded_file):
    # Check if a file has been uploaded
    if uploaded_file is not None:
        # Read the file into bytes
        bytes_data = uploaded_file.getvalue()

        image_parts = [
            {
                "mime_type": uploaded_file.type,  # Get the mime type of the uploaded file
                "data": bytes_data
            }
        ]
        return image_parts
    else:
        raise FileNotFoundError("No file uploaded")

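To try the same data structure outside Streamlit, for example with an image read from disk in a quick test, a sketch along these lines may be useful. `build_image_parts` and `ALLOWED_MIME_TYPES` are hypothetical names; the MIME type is guessed from the file name with the standard library, which is an assumption and not what `input_image_setup` does.

```python
import mimetypes

# Mirrors the jpg/jpeg/png file types accepted by the app's uploader.
ALLOWED_MIME_TYPES = {"image/jpeg", "image/png"}


def build_image_parts(filename, data):
    """Build the [{"mime_type": ..., "data": ...}] list from a file name and raw bytes."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"Unsupported image type for {filename!r}: {mime_type}")
    if not data:
        raise ValueError("The image file is empty")
    return [{"mime_type": mime_type, "data": data}]
```

The returned list has the same shape as the one produced by `input_image_setup`, so it can stand in for it when experimenting without the web interface.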
Next, specify the input prompt that will determine the behaviour of your app. Here, we are simply telling Gemini what to do with the text and image that the user will feed into the app.

input_prompt = """
You are an expert nutritionist. You should answer the question entered by the user in the input based on the uploaded image you see.
You should also look at the food items found in the uploaded image and calculate the total calories.
Also, provide the details of every food item with calories intake in the format below:

1. Item 1 - no of calories
2. Item 2 - no of calories
----
----
"""

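Because the prompt asks for a numbered "item - calories" list, the reply can in principle be parsed to recompute the total yourself. The sketch below shows one way to do that with a regular expression. The names are hypothetical and the sample text is made up for illustration; real model output is free-form and may not match this pattern, so treat it as a starting point only.

```python
import re

# Matches lines such as "1. Rice - 200 calories" (hyphen, en dash or em dash).
ITEM_PATTERN = re.compile(
    r"^\s*\d+\.\s*(?P<name>.+?)\s*[-\u2013\u2014]\s*(?P<calories>\d+(?:\.\d+)?)",
)


def parse_calorie_lines(text):
    """Return a list of (item_name, calories) pairs found in `text`."""
    items = []
    for line in text.splitlines():
        match = ITEM_PATTERN.match(line)
        if match:
            items.append((match.group("name"), float(match.group("calories"))))
    return items


# Made-up sample reply, used only to illustrate the parsing.
sample = """1. Grilled chicken - 250 calories
2. Steamed rice - 200 calories
Total: 450 calories"""

items = parse_calorie_lines(sample)
total = sum(calories for _, calories in items)
```

Recomputing the total from the parsed items gives a simple cross-check against the total that the model reports.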
The next step is to initialize Streamlit and create a simple user interface for your calorie counter app.

st.set_page_config(page_title="Gemini Calorie Counter App")
st.header("Calorie Counter App")
input = st.text_input("Ask any question related to your food: ", key="input")
uploaded_file = st.file_uploader("Upload an image of your food", type=["jpg", "jpeg", "png"])
image = ""
if uploaded_file is not None:
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded Image.", use_column_width=True)  # show the image

submit = st.button("Submit & Process")  # creates a "Submit & Process" button

The above steps contain all the pieces of the app. At this point, the user is able to open the app, enter a question and upload an image.
Finally, let's put all the pieces together so that when the "Submit & Process" button is clicked, the user gets the required response text.

# Once the "Submit & Process" button is clicked
if submit:
    image_data = input_image_setup(uploaded_file)
    response = get_gemini_reponse(input_prompt, image_data, input)
    st.subheader("The Response is")
    st.write(response)

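As written, clicking the button without uploading an image makes `input_image_setup` raise `FileNotFoundError`. One possible way to handle this more gracefully is to check the inputs first. The helper below is a hypothetical addition, not part of the original app.

```python
def validate_inputs(question, uploaded_file):
    """Return an error message for a missing input, or None if both are present."""
    if uploaded_file is None:
        return "Please upload an image of your food first."
    if not question or not question.strip():
        return "Please type a question about your food."
    return None
```

Inside the `if submit:` block, you could call `validate_inputs(input, uploaded_file)` and, if it returns a message, show it with `st.warning` instead of calling Gemini.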
5. Run the script and interact with your app
Now that the app development is complete, you can run it from the terminal using the command:

streamlit run calories.py

To interact with your app and see how it performs, open it in your browser using the local URL or network URL that Streamlit prints in the terminal.
This is what your Streamlit app looks like when it's first opened in the browser.
Once the user asks a question and uploads an image, here is the display:
Once the user pushes the "Submit & Process" button, the response shown in the image below is generated at the bottom of the screen.
For external access, consider deploying your app using cloud services like AWS, Heroku, or Streamlit Community Cloud. In this case, let's use Streamlit Community Cloud to deploy the app for free.
At the top right of the app screen, click 'Deploy' and follow the prompts to complete the deployment.
After deployment, you can share the generated app URL with other users.
Just like other AI applications, the results are the model's best estimates, so before relying completely on the app, please note the following potential risks:
- The calorie counter app may misclassify certain food items and thus give the wrong number of calories.
- The app has no reference point for estimating the size of the food portion from the uploaded image. This can lead to errors.
- Over-reliance on the app can lead to stress and mental health issues, as one may become obsessed with counting calories and worrying about results that may not be very accurate.
To help reduce the risks that come with using the calorie counter, here are possible improvements that could be integrated into its development:
- Adding contextual analysis of the image, which would help to gauge the size of the food portion being analysed. For instance, the app could be built so that a standard object like a spoon, included in the food image, is used as a reference point for measuring the sizes of the food items. This would reduce errors in the resulting total calories.
- Google could increase the diversity of the food items in its training set to reduce misclassification errors. It could expand the set to include food from more cultures so that even rare African food items can be identified.
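As a rough illustration of the reference-object idea, one low-effort experiment is to mention the object in the prompt itself. The sketch below only builds the prompt text. `add_reference_hint` is a hypothetical helper, the object's length has to be supplied by you, and whether such a hint actually improves the estimates is untested here.

```python
def add_reference_hint(base_prompt, reference_object, approx_length_cm):
    """Append an instruction asking the model to use a known object for scale."""
    hint = (
        f"If a {reference_object} is visible in the image, assume it is about "
        f"{approx_length_cm} cm long and use it to judge portion sizes."
    )
    return f"{base_prompt.rstrip()}\n{hint}\n"
```

The result could be passed to the model in place of the original input prompt, leaving the rest of the app unchanged.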