[Dev Catch Up # 102] - Gemini Embedding 2, Andrew Ng's Context Hub, Nemotron 3 Super, Perplexity's Personal Computer, gstack, AMI Labs, Agent 4, Karpathy's Auto Research, RCLI-Voice AI for Mac & more!
Bringing devs up to speed on the latest dev news and trends, including a bunch of exciting developments and articles.
Welcome to the 102nd edition of DevShorts, Dev Catch Up.
For those who joined recently or are reading Dev Catch Up for the first time, I write about developer stories and open source, partly based on my work and experience interacting with people all over the globe.
Must Read
Google has released Gemini Embedding 2, its first natively multimodal embedding model. It can map text, images, video, audio, and documents into one embedding space. This is useful for multimodal RAG, search, and retrieval use cases. Check Google’s Gemini Embedding 2 announcement for more details.
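To make the "one embedding space" idea a bit more concrete, here is a minimal sketch of how retrieval over such a space usually works: every item, whatever its type, becomes a vector, and search is a nearest-neighbor lookup by cosine similarity. The 3-dimensional vectors and file names below are hand-written placeholders, not real Gemini Embedding 2 output, and no Google API is called; in practice the vectors would come from the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for embeddings of an image,
# an audio clip, and a document that share one vector space.
index = {
    "photo_of_cat.jpg": [0.9, 0.1, 0.0],
    "meeting_audio.mp3": [0.1, 0.9, 0.2],
    "invoice.pdf": [0.0, 0.2, 0.9],
}


def search(query_vector, index, top_k=2):
    """Return the names of the top_k items most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Pretend this is the embedding of the text query "a cat".
query = [0.8, 0.2, 0.1]
results = search(query, index)
print(results)
```

Because the text query and the image live in the same space, the image ranks first even though the two were never the same modality.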
Andrew Ng has released Context Hub. It helps coding agents use curated and versioned docs through a CLI. It is designed to reduce API hallucinations and help agents improve across tasks. Check the Context Hub GitHub repo for more details.
NVIDIA has released Nemotron 3 Super, a new open 120B model for agentic AI. It is built for long and complex agent workflows. It supports a 1 million token context window and higher throughput. Check NVIDIA’s Nemotron 3 Super announcement for more details.
Perplexity has opened the waitlist for Personal Computer. It brings Perplexity Computer and Comet Assistant to your own system. Perplexity says sensitive actions need your approval and all actions are logged. Check Perplexity’s page for more details.
OSS Highlight of the Week
This week we are featuring RCLI. It is an on-device voice AI for macOS. You can control macOS actions with voice and also query your documents with voice. It needs no cloud and no API keys. Check the RCLI GitHub repo for more details.
Good to know
Cloudflare has added a new /crawl endpoint in Browser Rendering. It lets developers crawl an entire website with one API call. It returns the content in HTML, Markdown, and structured JSON. Check Cloudflare’s post for more details.
Microsoft has introduced Copilot Cowork. It helps Copilot turn a request into a plan and take action. It works using signals from Microsoft 365 apps like Outlook, Teams, and Excel. Check Microsoft’s post for more details.
OpenAI has opened applications for Codex for Open Source. Selected maintainers get ChatGPT Pro with Codex, Codex Security, and API credits. Check OpenAI’s page for more details.
Anthropic has launched Claude Marketplace. It allows customers to buy Claude powered tools from partners like GitLab, Replit, and Snowflake using their Anthropic commitment. Check Anthropic’s page for more details.
OpenAI has added interactive learning for math and science in ChatGPT. Now you can learn concepts with visual explanations inside ChatGPT. This makes formulas and graphs easier to understand. Check OpenAI’s post for more details.
Notable FYIs
Former Meta Chief AI Scientist Yann LeCun has launched AMI Labs. AMI Labs aims to solve the limits of standard language models by building world models. Its initial focus is on companies that run complex systems, like automotive, aerospace, biomedical, and pharma. Check the AMI Labs site for more details.
Y Combinator CEO Garry Tan has released gstack. It brings his Claude Code setup with 6 opinionated tools for roles like CEO, Engineering Manager, Release Manager, and QA Engineer. It is built to turn one coding agent into a team of specialist workflows. Check the gstack GitHub repo for more details.
Andrej Karpathy has released autoresearch. It lets AI agents run small LLM training experiments automatically on a single GPU. The agent edits the training code, runs a fixed 5 minute experiment, checks the result, and repeats. Check the autoresearch GitHub repo for more details.
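The edit, run, check, repeat loop described above can be sketched in a few lines. This is only an illustration of the loop shape, not code from the autoresearch repo: `propose_change` and `run_experiment` are hypothetical stand-ins (a random learning-rate tweak and a synthetic loss function), and keeping only changes that lower the loss is an assumption on my part.

```python
import random


def run_experiment(config, budget_seconds):
    """Hypothetical stand-in for a training run.

    A real run would train until budget_seconds elapse and report a
    validation loss. This toy ignores the budget and returns a synthetic
    loss that is lowest at learning_rate = 0.01.
    """
    return (config["learning_rate"] - 0.01) ** 2


def propose_change(config, rng):
    """Hypothetical stand-in for the agent's edit: rescale the learning rate."""
    candidate = dict(config)  # copy so the current best is never mutated
    candidate["learning_rate"] = config["learning_rate"] * rng.choice(
        [0.5, 0.8, 1.25, 2.0]
    )
    return candidate


def research_loop(config, rounds, budget_seconds=300, seed=0):
    """Edit, run a fixed-budget experiment, check the result, repeat."""
    rng = random.Random(seed)
    best_config = config
    best_loss = run_experiment(config, budget_seconds)
    for _ in range(rounds):
        candidate = propose_change(best_config, rng)
        loss = run_experiment(candidate, budget_seconds)
        if loss < best_loss:  # assumption: keep only improvements
            best_config, best_loss = candidate, loss
    return best_config, best_loss


start = {"learning_rate": 0.1}
start_loss = run_experiment(start, 300)
best_config, best_loss = research_loop(start, rounds=20)
print(best_config, best_loss)
```

The fixed time budget is what makes results comparable from one round to the next, which is why it is passed to every run rather than chosen per experiment.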
I came across an X post where a developer shared how he hacked Perplexity Computer and got access to the Claude Code API key used by Perplexity. He also shared a safer pattern for handling keys in multi-agent systems. Check the X post for more details.
Hume has open sourced TADA, a speech language model for text to speech. It is designed for faster generation and to reduce hallucinated words in speech output. This could be useful for teams building voice products. Check Hume’s post for more details.
Replit has introduced Agent 4 for more creative app building workflows. It can generate UI variants and handle multiple tasks in parallel. This helps developers move faster while building inside Replit. Check Replit’s Agent 4 announcement for more details.
That’s it from us for this edition. We hope you are going away with a ton of new information. Lastly, share this newsletter with your colleagues and pals if you find it valuable. If you are reading it for the first time, a subscription would be awesome.


