Rename MindStudio blocks & Digest of The Last Weeks in AI
Rename blocks in MindStudio canvas. Learn about new models and their integration with MindStudio.
You're receiving this email because you registered for one of our workshops. You can unsubscribe at the bottom of any email at any time.
This week, we revealed a secret feature already in production that you can start using today: renaming blocks. We understand that having many text generation blocks can become confusing, so you can now organize your flows more effectively. We will ship a better UI for this in the coming weeks.
Over the last couple of weeks, we got new models from xAI (Grok 2), the new line of Pixel 9 phones with Gemini Live, significant offers from Google and OpenAI to fine-tune models on their platforms, Claude prompt caching & a longer context window, and a Midjourney-level image generation model.
Continue reading to learn more!
Resources for Pros

What’s coming next:
- More types of data sources and data retrieval techniques (e.g. GraphRAG)
- A better API for power-users and developers
- UI for the renaming feature and more models
As a reminder, we’re now welcoming partners that want to build AIs for their clients. Sign up for extra support, training resources, and more here.
🗞️ Industry news
Google steps up their AI game & launches Pixel 9 with Gemini Live
Google is stepping up their Gemini game once again and announced Gemini Live together with the new Pixel 9 lineup.
Gemini Live is an alternative to ChatGPT Voice Mode. The main difference? This one is actually live for Gemini Advanced users, and all new Pixel 9 users get free access for a year in most regions.
Gemini Live isn't as good as the Voice Mode demo, nor is it as far-reaching, but Google deserves kudos for actually shipping it and making it useful. With access to the latest Gemini 1.5 Pro model and access to your phone sensors and apps, Gemini Live is a step in the right direction for agentive AI on device.
They also released other small but cool updates, like Add Me, which lets you take pictures with friends without asking random strangers, more free tokens for developers to try out Gemini, and a few more you can find here.
Personally, I’ve recently started using Gemini 1.5 Flash and Gemini 1.5 Pro in MindStudio. The vision model is very good, and the text completions are fast and creative. For example, they seem to outperform other models in my social media copy generation workflows. Give them a go!

OpenAI says they have a new model. No one noticed
Imagine shipping an update so small you need to tell users about it a week later… that’s kind of what happened with ChatGPT on Aug 12th.
In fairness, some people did notice that output quality in ChatGPT improved slightly, but it’s hard to pin down exactly what improved and why. OpenAI itself says most people should stick to GPT-4o and not use the new model.
With every major player now caught up to GPT-4o’s performance, and OpenAI losing ground with coders (Cursor, the most popular AI IDE, now defaults to Claude 3.5 Sonnet), people are wondering why they’re waiting to release something a bit more impactful.
This is not to say the new model isn’t great. It works well, it seems to produce longer outputs, and it comes at the same time as GPT-4o fine-tuning with free training tokens. But again, the difference was so small they had to announce it.
OpenAI is losing momentum by the day, and their old Google-like strategy of teasing features that never ship is starting to wear thin, in my honest opinion. I hope they’re cooking something great, and that their huge waitlists can start emptying out.
As a quick reminder, we’re now waiting for: Sora (Feb 2024), Voice Mode (May 2024), SearchGPT (Jul 2024), and of course GPT-5 (or “project Strawberry”), which we might need to wait until 2025 for.

xAI reaches GPT-4o level in a year & partners with top-tier image generator FLUX-1
Grok 2 is very close to GPT-4o’s performance
Elon Musk has become quite a controversial figure in the tech space, with his recent endorsement of Donald Trump after the assassination attempt and the massive Twitter Space (the Trump team claims the interview reached 1 billion people) to promote his agenda.
However, it must be said that the team he brought together for xAI is moving at an insane speed - reminiscent of what OpenAI looked like 2 years ago.
xAI announced Grok 1 in November 2023 and, 10 months later, shipped a model that’s light years ahead of Grok 1 and nearly matches GPT-4o and Claude 3.5 Sonnet. Very, very impressive.
xAI is also committed to open sourcing most of its research, so we might see the models becoming open source in the future.
Grok 2 is a text-only LLM and cannot generate images or audio. To solve for multimodality, xAI partnered with Black Forest Labs (founded by creators of Stable Diffusion) to bring their FLUX-1 model to Grok AI… almost completely uncensored.

Image generated by FLUX-1 in Grok AI and reposted by News AU
Other than NSFW content, the FLUX-1 model in Grok will almost never decline a request. Compared to DALL-E 3, which very frequently rejects your prompt, it’s a breath of fresh air… but it can also be a new level of danger for AI images.
Two main reasons:
- FLUX-1 is really good. So good that Midjourney opened their free trial on the web platform to let everyone test it for free. They’re feeling the heat. Soon it will be very hard to distinguish what’s real;
- Very influential people are producing controversial images and spreading them to a vast audience.

It’s very easy to create copyrighted material with FLUX-1
On a separate note, it’s unclear whether Black Forest Labs has the resources to stay around for the long term. AI image generators frequently get sued, and the company, which has raised $31m, might not be financially strong enough to sustain a big lawsuit.
🔥 Product Updates
You can now rename blocks in the editor. There’s no UI for renaming yet, and you’re one of the first to know about it!
To rename a block:
1. Click on the block
2. Press the “Enter” key on your keyboard
3. If it doesn’t work, refresh the page once and try again
This puts the block into editing mode. You can give the block any name, but we suggest sticking to something short.
Renaming blocks is part of a larger plan to make workflows easier to read. With copy-paste, renaming, and more coming soon, you can now create more complex workflows without worrying about forgetting what each part does!
💡 Tip of The Week
GPT-4o Vision, GPT-4 Turbo Vision, and Gemini 1.5 Vision are now available as a standalone block in MindStudio - no API required.
The “Analyze Image” block takes in two inputs:
- Prompt: what do you want to check in the image, and what do you want the model to give you? For example, do you want to describe the image, find something in it, or label it?
- Image URL: this has to be a publicly accessible URL pointing to your image, or a variable in your workflow that holds an image URL. Our new “Image Upload” input type does exactly that: it takes in an image and stores it at a URL. You can then use that variable in the Image URL field of the “Analyze Image” block.
And one output:
- Output variable: this is where the query result is saved. The response from GPT-4o Vision or GPT-4 Turbo won't show up directly to the user; it will only be stored as a variable.
Remember you need a “Display Text” block afterwards to display the content of the output variable. Otherwise, your user will never see the response.
Here, we’re displaying an image generated with DALL-E 3 and its caption generated by GPT-4o Vision or Gemini.
If you need the response to inform another prompt, then you can use the variable within a “Generate Text” block before sending it over to an LLM.
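If you’re curious what the block handles for you, vision requests of this kind generally follow the standard chat-completions format: one text part (your Prompt) and one image part (your Image URL) in a single user message. A minimal sketch in Python of that payload shape — note this is an illustration based on the common OpenAI-style vision format, not MindStudio’s internals, and no code or API key is ever needed inside the editor:

```python
# Sketch of the request shape behind an image-analysis call.
# The "Analyze Image" block builds and sends something like this for you.

def build_vision_payload(prompt: str, image_url: str) -> dict:
    """Mirror the block's two inputs: a prompt and a public image URL."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# Example inputs (hypothetical URL — it must be publicly accessible):
payload = build_vision_payload(
    "Describe this image in one sentence.",
    "https://example.com/my-image.png",
)
```

The model’s reply to such a request is what lands in your output variable, which is why a “Display Text” block is needed to actually surface it.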
You can see an example of Vision in MindStudio here.
🤝 Community Events
If you want to hang out with our team, we usually host a Discord event every Friday at 3 PM Eastern. Join our Discord server to keep up to date with the hangouts - our entire team is active there.
You can register for upcoming events on our brand new events page here.
Our new webinar series is up there as well, with a selection of on-demand webinars.
Plus, we have new weekly and bi-weekly events.
Thank you for being an invaluable member of our community; it’s always great to see many of you join multiple workshops 🔥
If you’re interested in any topic in particular, feel free to reply and I’ll do my best to include it in an upcoming issue. We’re going to update all of these soon.
🌯 That’s a wrap!
Stay tuned to learn more about what’s next and get tips & tricks for your MindStudio build.
You saw it here first,
Giorgio Barilla
MindStudio Developer & Project Manager @ MindStudio
How did you like this issue?