[Dev Catch up #14] - GigaGPT, Meta Audiobox, Business-Intelligence-as-code, and much more.
Bringing devs up to speed on the latest trending dev news, including the introduction of GigaGPT, Meta's new voice AI model Audiobox, and a bunch of other exciting developments and articles.
Tech keeps evolving, with new developments arriving constantly. As usual, DevShorts is back with another issue to simplify your digest of community news. Like our previous issues, this one covers the unique stories trending in our developer circle, along with a look at new open-source projects, tutorials, conference news, and much more.
Must Read
Large Language Models (LLMs) are getting more popular by the day, and with the rise of AI, more and more organizations are introducing their own. OpenAI's GPT models took the internet by storm, and since then many more LLMs have come into the limelight. Cerebras has introduced GigaGPT, which it bills as the simplest and most compact codebase for training and fine-tuning GPT models. It is an implementation of nanoGPT that trains models with over 100 billion parameters without additional code or third-party frameworks. In only 565 lines of code, it uses the large memory and compute capacity of Cerebras hardware to enable large-scale training with vanilla torch.nn code. Learn more about GigaGPT from this article published by Cerebras, which explains the model and how it works in more depth.
Analytics is essential to any business that deals with data, which makes the analytics dashboard an important software asset. A good analytics dashboard should be tested and versioned, and it may exist in several environments, such as staging and production. Many Business Intelligence (BI) tools can build such dashboards, but they are designed with a UI-first approach and offer little to prevent breaking changes or to roll them back. This has given rise to BI-as-code tools, which address these long-standing challenges. BI-as-code is a methodology that defines dashboards in code, so they can be versioned and tested through proper CI pipelines. MotherDuck has published an excellent article that discusses various BI-as-code tools that work with DuckDB and how they are shaping the future of BI.
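To make the idea concrete, here is a minimal sketch of what a CI check for a dashboard metric could look like. It is not taken from the MotherDuck article: the metric name, the table, and the fixture rows are all hypothetical, and Python's built-in sqlite3 module stands in for DuckDB so the example runs with no extra dependencies.

```python
import sqlite3

# Hypothetical metric definition. In a BI-as-code setup, this SQL would live
# in a version-controlled file next to the dashboard that uses it.
REVENUE_BY_REGION_SQL = """
    SELECT region, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY region
"""


def load_fixture(conn):
    """Create a tiny, known dataset so the expected result is unambiguous."""
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("EU", 100), ("EU", 50), ("US", 200)],
    )


def run_metric(conn, sql):
    """Run a metric query and return its rows as a list of tuples."""
    return conn.execute(sql).fetchall()


def check_revenue_by_region():
    """Run the metric against the fixture, as a CI job might before a release."""
    conn = sqlite3.connect(":memory:")  # sqlite3 is a stand-in for DuckDB here
    try:
        load_fixture(conn)
        return run_metric(conn, REVENUE_BY_REGION_SQL)
    finally:
        conn.close()


if __name__ == "__main__":
    rows = check_revenue_by_region()
    assert rows == [("EU", 150), ("US", 200)], rows
    print("metric check passed")
```

Because the query is plain text in the repository, a change to it shows up in a diff, can be reviewed, and can be reverted like any other code change.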
Meta introduced some of the best AI models of 2023, and the collection recently got its latest addition with the introduction of Audiobox. Earlier in 2023, Meta introduced Voicebox, a generative AI model that can perform speech-generation tasks like editing, sampling, and stylizing. Audiobox is the successor to that model: an advanced generative AI model that unifies generation and editing capabilities for speech, sound effects, and more, through a variety of input mechanisms that maximize controllability for each use case. People can use natural language prompts to generate a sound or speech of their choice. This article from the engineering team at Meta discusses Audiobox and the engineering behind it in detail, with the help of demos.
Securing your first-party code is essential for any developer, and there are various ways to do it. The most common is to use static code analysis tools that scan your code before it is deployed to environments like testing, staging, and production. However, this method only gives you insight into the static aspects of your code. To compensate, teams often rely on dynamic application security tools that perform black-box tests and examine the application from an external perspective. Although this approach adds detail, it still does not show the overall security situation. New Relic interactive application security testing (IAST) aims to address these gaps: by observing the code as it is being executed, it extends your observability with visibility into the dynamic aspects of your application. Learn more about New Relic IAST from this article by Harry Kimpel of New Relic, which includes a detailed tutorial on using IAST.
AI and ML saw some serious improvements throughout 2023. Developers and organizations are joining in to build more models with the help of fine-tuning, the technique of training pretrained models on smaller, task-specific datasets. Hugging Face recently demonstrated how to fine-tune a personal code copilot on your own codebases using Python and the Transformers library. Code completion models are a great innovation for programmers because they boost productivity. Such models can also be fine-tuned using the Elixir programming language, and this article from Dockyard explains how you can do that and create a code copilot of your own.
Now, we will head over to some of the news and articles that will be of interest to developers and the wider tech community.
Good to know
A lot of applications use user data to deliver useful features. But data is sensitive, and personal data in the wrong hands opens the door to a whole range of privacy risks. Often, though, companies are not interested in a user's individual data points; they want aggregated data. The Distributed Aggregation Protocol (DAP) allows data to be aggregated without revealing any individual data point. This matters when a data collector is interested in general trends over a population and does not need access to sensitive data. It is a step in the right direction, but private aggregation alone is not enough to protect privacy. This is where differential privacy comes in. This blog from Cloudflare talks about the shortcomings of DAP and how they are improving it with the help of differential privacy.
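As a rough illustration of the differential privacy idea, the sketch below adds Laplace noise to an aggregate count before it is released. It is a simplified, single-machine example under stated assumptions, not how DAP or Cloudflare implements it: the function names are hypothetical, each user is assumed to contribute at most one boolean report, and the noise is added in one place instead of being split across aggregators.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` is Laplace-distributed with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(reports, epsilon, rng=None):
    """Return a noisy count of the truthy reports.

    Assumption: each user contributes at most one report, so adding or
    removing one user changes the count by at most 1 (sensitivity 1).
    The Laplace mechanism then uses a noise scale of 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for report in reports if report)
    return true_count + laplace_noise(1 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: 70 of 100 users reported using a feature.
    reports = [True] * 70 + [False] * 30
    print(private_count(reports, epsilon=0.5))
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon means a more accurate count and weaker privacy. The released value still tracks the population trend, while any single user's report is masked by the noise.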
An efficient Incident Management process requires a clear definition of Incident Severity levels. This helps engineering teams respond quickly to outages and mitigate customer impact. Running an online platform that needs high availability across multiple sectors is a complex task, and every part has to work well for customers to have a satisfactory experience. If any part fails, there should be an Incident Management process that coordinates mitigation and resolves the problem as fast as possible to bring the system back in order. Learn more about this process from this article by Jonathan Word, which explains incident severity levels in detail along with the best possible response for each.
In our previous edition, we mentioned how Llamafile is the new best way to run an LLM on your own computer. Installation is easy, and complex tasks become simpler when you can run them as one-liner bash commands. From filename generation and URL summarization to code completion and email composition, you can execute all of these tasks with a single command each. This article from Justine Tunney shows how you can install Llamafile on your local machine and use one-liner bash commands for some of the most complex tasks.
The open-source tool of the week is Zilla, which has racked up quite a number of stars on GitHub. It is a multi-protocol, event-native proxy developed by Aklivity. It can securely interface web apps, IoT clients, and microservices to Apache Kafka through declaratively defined stateless APIs. It natively supports the Kafka wire protocol and uses advanced protocol mediation to establish stateless API entry points into Kafka. It does not depend on the Kafka Consumer/Producer API or Kafka Connect and has no external dependencies. You can check out Zilla on its official GitHub page and leave a star to support it.
Lastly, we will take a look at some of the trending scoops that deserve a special mention for the community.
Notable FYIs
Meta is one of the leaders in the race when it comes to Generative AI and LLMs. A few days ago, it released a free AI image generator website based on its Emu image-synthesis model, which is trained on 1.1 billion publicly available images from Facebook and Instagram. It can render a new image from a written prompt. More details on the website and the model can be found in this article published by Ars Technica.
Meta is in the news once again, having recently announced that it is bringing end-to-end encryption to Facebook Messenger. It released two whitepapers explaining the cryptographic protocol for transmitting messages between clients and the encryption protocol for storing message history across the devices on a user’s account. Here is an article from the Meta engineering team with more details on the new E2EE updates.
As a developer, you aim to focus on your work and write less code while staying productive. To achieve that, you often use libraries. If you use Next.js daily or plan to use it for an upcoming project, here is an article from Nevo David that covers a few Next.js libraries in detail to help you build great things with the framework.
Cloud-native monitoring is more interesting than ever with the emergence of OpenTelemetry, and people are keen to learn about this technology. Here is a Q&A video from the official OpenTelemetry YouTube channel in which Jennifer Moore shares her experience of focusing on observability during a challenging migration and answers some burning questions on OTel.
iMessage is the most popular messenger on iOS. Here is a brief article from Jjtech that gives a cursory overview of the internals of iMessage. It covers everything from the foundational layer of the messenger to the underlying message encryption and points to significant resources for further reading.
That’s it from us for this edition. We hope you are going away with a ton of new information. If you find this newsletter valuable, share it with your colleagues and pals, and if you are reading for the first time, a subscription would be awesome.