Managed Cloud Services Overview and News | The New Stack
https://thenewstack.io/cloud-services/

Install Cloud Foundry on Azure Kubernetes Clusters
https://thenewstack.io/install-cloud-foundry-on-azure-kubernetes-clusters/ | Fri, 22 Sep 2023

Microsoft Azure is a cloud computing platform that offers a broad range of services, including computing, networking, storage, databases, analytics, machine learning, and artificial intelligence. It is a highly scalable and reliable platform that can be used to run a wide variety of workloads.

Azure Kubernetes Service (AKS) is a powerful and easy-to-use platform for deploying and managing containerized applications on Azure. If you are looking for a reliable and scalable way to run containerized applications, AKS is a great option.

Over the past few years, the popularity of Microsoft Azure — especially among startups and the open source community — has been on the rise. Around this time last year, Azure reported a solid 40% rise in revenue. We, the Cloud Foundry community, firmly believe that Kubernetes users benefit greatly from a powerful abstraction that makes clusters easier to manage. Installing Cloud Foundry Korifi on an AKS cluster provides exactly that.

What Is Korifi?

Korifi is a Cloud Foundry implementation that runs on Kubernetes. It is a community-driven project that aims to provide a simple and efficient way to deploy and manage cloud native applications on Kubernetes while preserving the classic Cloud Foundry developer experience. Developers can still use the cf push command to deploy applications to Korifi, and they can still use the Cloud Foundry CLI to manage them. Here are some of the benefits of using Korifi:

  • It is a simple and efficient way to deploy and manage cloud native applications on Kubernetes.
  • It preserves the classic Cloud Foundry developer experience.
  • It takes advantage of the many features and capabilities of Kubernetes.

Prerequisites

To begin the installation, first install the required tools: the Azure CLI (az), kubectl, Helm and the Cloud Foundry CLI (cf).

Azure also provides a container registry that can be used to store the images built by Korifi. Azure Container Registry (ACR) is a managed Docker registry service that lets you store and manage your container images and artifacts. It serves as a private (or public) repository for storing and managing container images. ACR integrates seamlessly with other Azure services like Azure Kubernetes Service (AKS) and Azure DevOps, allowing developers to use container images stored in ACR for deployments. This installation makes use of such a registry; follow the steps here to create a container registry.

Installation

First, create a Kubernetes cluster. You can use either the Azure portal UI or the Azure CLI. When creating the cluster, make sure it belongs to the same resource group as the container registry. Here are links to the Azure docs for each of the two methods, followed by a CLI sketch:

  1. How to create a Kubernetes cluster using the Azure portal UI
  2. How to create a Kubernetes cluster using the Azure CLI
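
If you prefer the CLI route, the following is a minimal sketch that creates the resource group, the container registry and the AKS cluster in one pass. The registry name, location, node count and VM sizing are illustrative assumptions; the resource group and cluster names match the ones used later in this walkthrough.

# Hedged sketch: resource group, ACR instance and AKS cluster via the Azure CLI.
# The registry name and location are placeholders; adjust sizing to your needs.
az group create --name korifi-test_group --location eastus

az acr create --resource-group korifi-test_group --name korifitestacr --sku Basic

# Attaching the registry grants the cluster pull access to images stored in ACR.
az aks create --resource-group korifi-test_group --name korifi-test \
  --node-count 2 --generate-ssh-keys --attach-acr korifitestacr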

Once a Kubernetes cluster has been created, connect to it. The following commands will help connect to the cluster:

az account set --subscription <enter_subscription_id_here>

az aks get-credentials --resource-group korifi-test_group --name korifi-test

Merged "korifi-test" as current context in /home/ram/.kube/config


In this case, a Kubernetes cluster named korifi-test, which belongs to the resource group korifi-test_group, is added to the kubectl config.

Next, install the following dependencies: cert-manager, kpack, and Contour.

cert-manager is an open source certificate management solution designed specifically for Kubernetes clusters. It can be installed with a single kubectl apply command, referencing the latest release in the path to the YAML definition.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml


kpack is an open source project that integrates with Kubernetes to provide a container-native build process. It uses Cloud Native Buildpacks to produce OCI-compatible container images. kpack can be installed with a kubectl apply command, passing the YAML manifest of the latest release.

kubectl apply -f https://github.com/pivotal/kpack/releases/download/v0.11.0/release-0.11.0.yaml


Contour is an open source ingress controller for Kubernetes that is built on top of the Envoy proxy. An ingress controller is a Kubernetes component that manages inbound network traffic to services within a cluster. It acts as a gateway and provides external access to the services running inside the cluster. Contour specifically focuses on providing advanced features and capabilities for managing ingress in Kubernetes.

kubectl apply -f https://projectcontour.io/quickstart/contour.yaml


Once Contour is installed on the Kubernetes cluster, it provisions an external-facing IP address that allows traffic to reach the cluster. Query the Envoy service managed by Contour to find the IP address that will be mapped for ingress into the cluster:

kubectl get service envoy -n projectcontour -ojsonpath='{.status.loadBalancer.ingress[0]}'


The output from this command will be an IP address, e.g. {"ip":"34.31.52.175"}, which will be used in several places as the base domain, suffixed with nip.io.

Korifi requires a container registry to function, and this installation uses Azure Container Registry. To access the registry, first generate a token and a token password in Azure.

Next, those credentials are used to create a secret in the cluster. The command for creating the registry credentials secret is as follows:

kubectl --namespace "cf" create secret docker-registry image-registry-credentials \
  --docker-server="<azure-registry-url>" \
  --docker-username="<azure-registry-token-name>" \
  --docker-password="<azure-registry-token-password>"


For this installation, use the ACR login server as the registry URL, along with the token name and token password generated in the previous step.

Once the secret has been created, install Korifi on the Azure cluster using the Korifi Helm chart; a sketch of the Helm invocation follows the note below.

Note: We use nip.io as the suffix for the externally available IP address that can reach the cluster. Nip.io is a wildcard DNS provider.
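
The exact Helm invocation depends on the Korifi release; the sketch below approximates the values documented in the Korifi installation guide, with the ingress IP, registry prefix, namespace and admin user name standing in as assumptions. Check the chart values for the release you deploy.

# Hedged sketch of a Korifi Helm install; value names can change between releases.
# Substitute your Contour ingress IP, ACR login server and cluster admin user.
helm install korifi https://github.com/cloudfoundry/korifi/releases/download/v<version>/korifi-<version>.tgz \
  --namespace korifi --create-namespace \
  --set=generateIngressCertificates=true \
  --set=rootNamespace=cf \
  --set=adminUserName=kubernetes-admin \
  --set=api.apiServer.url=api.20.232.66.98.nip.io \
  --set=defaultAppDomainName=apps.20.232.66.98.nip.io \
  --set=containerRepositoryPrefix=<azure-registry-url>/korifi/ \
  --set=kpackImageBuilder.builderRepository=<azure-registry-url>/korifi/kpack-builder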

Once installation is completed, use the cf CLI to set the API endpoint and log in to the cluster:

cf api https://api.20.232.66.98.nip.io --skip-ssl-validation

cf login


The following commands can be used to test the installation:

cf create-org flighty

cf target -o flighty

cf create-space -o flighty bird

cf target -o flighty -s bird

cf push mighty-monkey -p ~/sandbox/korifi/tests/smoke/assets/test-node-app/


This push should end with an application built using Paketo Buildpacks and deployed to the remote Kubernetes cluster.

Conclusion

Azure Kubernetes Service (AKS) is an excellent choice for organizations looking to deploy containerized applications at scale with minimal management overhead and high reliability. It simplifies the process of setting up and maintaining Kubernetes clusters, making it an attractive option for developers and IT teams embracing containerization and cloud native application development.

For teams that want a mature platform to manage Azure Kubernetes clusters, Cloud Foundry Korifi is a great choice. It provides a first-class, multitenant experience on top of Kubernetes, a critical piece that is otherwise missing. Korifi can save many engineering hours otherwise spent tinkering with internal development platforms and, in the long run, save ops cycles too. It frees application developers to focus on what is most important while remaining safe, secure and scalable. To get started with Korifi, visit the GitHub repository.

MySQL HeatWave Gets Generative AI and JavaScript, Slew of New Features
https://thenewstack.io/mysql-heatwave-gets-generative-ai-and-javascript-slew-of-new-features/ | Thu, 21 Sep 2023

As the Oracle CloudWorld conference takes place in Las Vegas this week, Oracle‘s MySQL team is announcing a number of enhancements to the HeatWave platform that shore up its core functionality; add capabilities in the realm of generative AI; enhance support for the data lakehouse approach to analytics data management, autonomous operation, and in-database machine learning; and address core programmability and performance on the OLTP side, too.

Developer Goodies

The MySQL team briefed the media by starting on the analytics side, and leaving the developer-oriented features for last. As far as readers of The New Stack are concerned, I say they buried the lede, so I’m going to kick off with what the MySQL team left until last: goodies for developers including JSON acceleration and JavaScript-based stored procedures and functions.

JSON support in the base MySQL platform allows JSON data to be materialized in binary and text columns in tables or in virtual columns. It also allows JSON payloads to be passed to stored procedures and functions as arguments. MySQL supports use of its MongoDB API-compatible XDevAPI on the client side, and numerous programming languages can be used in the MySQL shell to manipulate JSON data on the input or output side. But now JSON data can be brought into HeatWave, where it is stored in binary format, partitioned, compressed up to 3x and scaled across nodes. The MySQL team says simple filter queries can be accelerated up to 20x, aggregation queries up to 22x and large join queries up to 144x.

Moving on from the JavaScript Object Notation format to the JavaScript language itself, stored procedures in HeatWave can now be coded in that language, in addition to the long-supported use of SQL. SQL is a declarative, set-based language, which can make it hard to perform more imperative tasks. JavaScript stored procs and functions eliminate this constraint and are called and used in exactly the same way as SQL-based ones, be it in queries, views, data manipulation language (DML) commands or data definition language (DDL) commands.

Data type conversions between the two languages are implemented implicitly. The JavaScript code executes in a GraalVM virtual machine, which provides for secure/sandboxed use of compute and memory, and which blocks direct network and file system access.
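
As a rough illustration, the snippet below defines and calls a small JavaScript function from the MySQL client. The LANGUAGE JAVASCRIPT AS $$ ... $$ form follows Oracle's published examples, but treat the exact DDL, connection details and availability on your HeatWave version as assumptions to verify against the documentation; credentials are assumed to come from an option file or the MYSQL_PWD environment variable.

# Hedged sketch: a JavaScript stored function in MySQL HeatWave, invoked like any SQL function.
mysql --host <heatwave-endpoint> --user admin <<'SQL'
CREATE FUNCTION js_gcd(a INT, b INT)
RETURNS INT LANGUAGE JAVASCRIPT AS $$
  let [x, y] = [Math.abs(a), Math.abs(b)]
  while (y) { [x, y] = [y, x % y] }
  return x
$$;
SELECT js_gcd(48, 36);
SQL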

Lakehouse Enhancements

Now let’s move on to HeatWave’s lakehouse capabilities, as there are a few dimensions to it. First off, HeatWave is adding support for the Apache Avro data file format to its existing compatibility with CSV and Apache Parquet formats. The functionality includes support for multiple compression algorithms, across which the team says performance is consistent. Avro support also includes — via HeatWave’s “Autopilot” assistance feature — schema inference, cluster capacity estimation for data load operations, and a time estimate for same.

What’s key in this announcement is that HeatWave now supports an optimized data format for row-oriented data. Compare this with the unoptimized text-based CSV and the column-oriented Parquet format and you can see that Oracle’s MySQL team is paying attention to OLTP workloads, in addition to the analytical workload support that was HeatWave’s original hook. Meanwhile, that analytical side would benefit from support for the Delta, Iceberg and/or Hudi open table formats that build on top of the Parquet standard.

Next on the lakehouse side is support for HeatWave on the Amazon Web Services cloud. This means data in any of the three supported formats that any customer may already have in Amazon’s S3 object storage is now available for processing with HeatWave. Even though HeatWave itself runs in Oracle’s own AWS account, connectivity to data in the customer’s account is still provided. Adding S3 data to HeatWave can be done simply by providing an ENGINE = LAKEHOUSE clause in a CREATE TABLE command, and that command can itself be auto-generated by Autopilot, leveraging the schema inference we’ve already discussed.
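
To make that concrete, here is a minimal sketch of what an external Lakehouse table might look like. The ENGINE = LAKEHOUSE clause comes from the announcement itself; the column list, the ENGINE_ATTRIBUTE keys pointing at the object storage location and file format, and the load step are assumptions to check against the HeatWave Lakehouse documentation.

# Hedged sketch: declare an external HeatWave Lakehouse table over Parquet files in object
# storage and load it into the HeatWave cluster. ENGINE_ATTRIBUTE keys are assumptions.
mysql --host <heatwave-endpoint> --user admin <<'SQL'
CREATE TABLE trips (
  trip_id BIGINT,
  pickup_ts DATETIME,
  fare DECIMAL(8,2)
) ENGINE = LAKEHOUSE SECONDARY_ENGINE = RAPID
  ENGINE_ATTRIBUTE = '{"file": [{"uri": "s3://my-bucket/trips/"}], "dialect": {"format": "parquet"}}';

ALTER TABLE trips SECONDARY_LOAD;
SQL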

AutoML Enhanced, Now Encompasses Generative AI

Moving on to the world of AI, HeatWave’s AutoML (automated machine learning) can leverage this S3 data access, including the new Avro support, to build machine learning models that reside in HeatWave and are trained on HeatWave data. HeatWave AutoML also supports recommendation models, beyond other AutoML platforms’ typical support for classification, regression, clustering/anomaly detection and time-series forecasting models.

With respect to competition, Oracle claims HeatWave’s training times are 25x faster than those for Amazon Redshift, with the implication that HeatWave is a better analytics database for AWS than AWS’ own data warehouse offering. And beyond Redshift, Snowflake’s Snowpark ML provides a bridge to scikit-learn and doesn’t provide any built-in AutoML, according to the MySQL team.

There’s generative AI support in MySQL AutoML too, and it takes a couple of forms, including support for large language models (LLMs) and a built-in vector store. On the LLM side, HeatWave can use BERT and TF-IDF to generate embeddings from the content of text columns in the database and submit them to the AutoML engine, alongside numerical representations of data in conventional scalar data columns. From all these inputs, tuned models are produced.

Documents in object storage factor in as well, as vector embeddings for them can be stored and indexed in the HeatWave vector store. Together, these features lead to more contextual answers to generative AI queries, as data in the vector store can be used to augment the prompts sent to the LLM.

Autonomous Autopilot

Moving on to HeatWave’s Autopilot, which uses AI to implement autonomous operation, or assistance with advanced features, the team has added support for Autopilot indexing, auto unload, auto compression, and adaptive query execution. The last of these, according to the MySQL team, dynamically adjusts data structures and system resources even after query execution has begun, to accommodate the actual distribution of the data observed as the query engine encounters it. The MySQL team reports first-run performance improvement of between 10% and 25% as a result of adaptive query execution.

Autopilot indexing is a machine learning-driven service that recommends secondary indexes for OLTP workloads, and includes suggesting new indexes as well as pointing out superfluous (e.g. unused or duplicate) indexes that should be dropped. Autopilot indexing takes both queries and DML operations — like UPDATE, INSERT and DELETE — into account. The service also predicts both storage requirements and performance, and it provides explanations for its recommendations.

Auto load and unload moves data from a conventional MySQL database into and out of the HeatWave cluster, based on frequency of access, helping developers avoid performing these operations manually. Auto-column compression will mix and match compression algorithms on a per-column basis, finding the right balance between memory usage and performance. The company claims memory savings of between 6% and 25% and performance increases between 6% and 10%. The fact that there can be improvement on both the memory and perf axes, rather than making developers choose between them, is an impressive testimonial to the value of algorithmic optimization.

And More

Other capabilities include a bulk data ingest/load feature, partitioning, analytics functions, SET operations, and availability on multiple clouds (Amazon Web Services, Microsoft’s Azure and Oracle Cloud Infrastructure). These and all the other capabilities discussed here should ensure continued momentum for MySQL HeatWave that Oracle says it has seen in the digital marketing, gaming, healthcare and fintech sectors. This is a real smorgasbord of capabilities, demonstrating that Oracle views MySQL as a strategic asset in its portfolio. Does Oracle Database itself rule the roost? Maybe. But MySQL, with its decades-long ecosystem, its huge community, and its modular, pluggable engine architecture, has found new life in the cloud, in analytics, in machine learning, and now in generative AI.

Oracle Introduces New App Analytics Platform, Enhances Analytics Cloud
https://thenewstack.io/oracle-introduces-new-app-analytics-platform-enhances-analytics-cloud/ | Thu, 21 Sep 2023

At its Oracle CloudWorld conference in Las Vegas this week, Oracle is introducing a range of new analytics capabilities. In addition to its core Oracle Database, MySQL and MySQL HeatWave businesses, Oracle focuses on analytics and applications. As such, the new analytics capabilities it is announcing accrue to both its Oracle Analytics Cloud (OAC) platform as well as the value-added functionality for Oracle applications that run atop that platform.

A Full Data Intelligence Platform

It’s with respect to the latter that Oracle is announcing the new Fusion Data Intelligence Platform. This new service is an evolution of the Fusion Analytics platform that preceded it, but in addition to Fusion Analytics’ semantic models that are defined and materialized in Oracle Analytics Cloud, the new service includes 360-degree data models, analytic artifacts, AI and BI models and pre-built intelligent apps.

Those pre-built apps bring in data models, ML models and analytics, designed to be accessible to people who don’t currently use self-service BI, and prefer to stay a level of abstraction above it. Oracle demoed a “Supply Chain Command Center” application as an example. It was a full-blown browser-based application with BI and AI capabilities already implemented and built in.

External Data too, All in the Lakehouse

Like Fusion Analytics, Fusion Data Intelligence Platform is not an island. For example, it will allow the addition of external data and will link to the likes of Salesforce, LinkedIn, and other external services with business-relevant data. On the Oracle applications side, Fusion Data Intelligence Platform will tie into Oracle Netsuite, Oracle Health and Oracle Industries applications. Fusion Data Intelligence Platform also integrates with, and includes an instance of, OAC, which Fusion Analytics did as well.

All data will land in a single Oracle Cloud Infrastructure (OCI) data lakehouse with a semantic model, ML models, etc. and OAC tie-ins. Though the lakehouse will feature a single model, it will be broken into multiple “subject areas” for specific target audiences.

OAC Gets AI

It’s not only at the Fusion Data Intelligence Platform level where Oracle has added AI capabilities. After all, Fusion Data Intelligence Platform is a layer above OAC, and Oracle has added AI capabilities to OAC as well.

OAC now has an Analytics Assistant, offering a chatbot interface on your data, with links to public data via ChatGPT. In partnership with Synthesia, the Assistant features avatars that can act as “news readers” to deliver data stories verbally to business decision-makers.

AI-Powered Document Understanding can scan JPEG and PDF files — and extract values and context. One example mentioned by Oracle, for applying this in practice, was the reading of individual receipt images to ensure their totals match the data in expense reports.

Narratives, Teams Integration, and the Business User Strategy

Contextual Insights implements natural language generation to provide narratives of users’ data. It’s similar in concept to Power BI’s Smart Narratives and Data Stories/narratives in Tableau. OAC now also integrates with Microsoft Teams, letting users bring OAC dashboards, visualizations, and insights into Teams channel chats. The functionality provided is similar to the previously introduced integration of OAC with Slack.

The range of capabilities added to Oracle’s Analytics platform should greatly benefit Oracle Applications customers. While customers might think of Power BI or Tableau when the subject of analytics comes up, Oracle is making it unnecessary to bring in third-party platforms when it comes to AI- and BI-driven insights on its applications’ data. Its goal is to go beyond self-service analytics and instead just surface analytics capabilities in business users’ tools. Clearly, Oracle is delivering in that area.

OpenAI Chats about Scaling LLMs at Anyscale’s Ray Summit
https://thenewstack.io/openai-chats-about-scaling-llms-at-anyscales-ray-summit/ | Tue, 19 Sep 2023

This week at Anyscale’s Ray Summit, a conference focused on LLMs and generative AI for developers, attention was turned to the business of scaling.

Robert Nishihara, the co-founder and CEO of Anyscale, opened the Ray Summit by warning that the LLM era was about to get even more complex and data-intensive than it already is. “Soon we’ll all be using multimodal models, working not just with text data, but also video and image data,” he said. “It’s going to become far more data intensive. On the hardware front, the variety of accelerators that we need to support will grow. On the application front, applications are becoming far more complex.”

Funnily enough, Anyscale has just the product to deal with this new layer of complexity. It already runs the open source platform, Ray, a distributed machine learning framework being used by OpenAI, Uber and others. But now it’s launching Anyscale Endpoints, which lets developers integrate, fine-tune and deploy open source LLMs at scale.

“This is an LLM API, an LLM inference API — like the OpenAI API, but for open models like Llama 2,” said Nishihara about Endpoints.

The cost for this will be $1 per million tokens. “That is the price point for the 70 billion parameter Llama model and that is the lowest price point on the market,” he claimed.
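
Because Endpoints mirrors the OpenAI API shape, calling it looks like a standard chat-completions request pointed at Anyscale's base URL. The sketch below is illustrative only: the base URL, model identifier and environment variable name are assumptions rather than details from the announcement.

# Hedged sketch: an OpenAI-style chat completion against Anyscale Endpoints.
curl -s https://api.endpoints.anyscale.com/v1/chat/completions \
  -H "Authorization: Bearer $ANYSCALE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-2-70b-chat-hf",
        "messages": [{"role": "user", "content": "Summarize Ray in one sentence."}]
      }'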

Endpoints includes the ability to fine-tune an LLM, however for further customization customers will need to upgrade to the full Anyscale AI Application Platform, which the company says gives them “the ability to fully customize an LLM, and have fine-grained control over their data and models and end-to-end app architecture as well as deploy multiple AI applications on the same infrastructure.”

Still, being able to fine-tune an LLM via API is very useful for any application that doesn’t require massive scale.

Also announced was Anyscale Private Endpoints, which enables customers to run the service inside their own cloud.

A Sit down with OpenAI Co-Founder John Schulman

As well as the product announcements, Nishihara sat down with John Schulman, one of the founders of OpenAI and a creator of ChatGPT. After some initial chitchat, Nishihara brought up the issue of scale for OpenAI. “Where did that belief in the importance of scaling models and compute come from?” he asked.

“The founding team of OpenAI […] leans more towards this aesthetic of scale up simple things, rather than trying to build some complicated clever thing,” Schulman replied. He then made the point that scaling in machine learning is more complicated than many people realize.

“There [are] usually all these little details, like you have to scale your learning rates just right — otherwise, you get worse results with big models — and you have to scale your data up along with the model size. So I’d say that it took several years to figure out what were the right recipes for scaling things.”

John Schulman, one of the co-founders of OpenAI and a creator of ChatGPT.

To tease out more about OpenAI’s approach to scaling, Nishihara asked, “What’s stopping [you] from using, you know, 70 trillion parameter models today, or even bigger?”

“It’s about compute efficiency,” Schulman replied. “So now we know you can train a small model for really long, or a big model for short, and there’s some trade-off — and it turns out that somewhere in the middle, you get the best compute efficiency.”

Schulman noted that this is likely to change, but for now, a 70 trillion parameter model isn’t optimal.

Nishihara later brought up that OpenAI is “pushing the limits at OpenAI of scale, in a lot of different dimensions” and asked Schulman about its infrastructure. Obviously, it was a leading question, since OpenAI uses Anyscale’s Ray system to do distributed computing. Even so, it was interesting to hear further details about how OpenAI operates.

“We have a library for doing distributed training and it does model parallelism,” Schulman explained. “So you’re sending around weights and gradients and activations, and […] we use Ray as a big part of that for, for doing all the communication.”

To end the discussion, Nishihara made an interesting observation about the state of AI a decade ago. “Looking back a decade ago, problems like unsupervised learning were not that well understood,” he said. “Or perhaps we didn’t know how to conceptualize the problem.” He asked Schulman what are the problems today “that we’re still figuring out how to formulate?”

Schulman first mentioned “data accuracy,” a nod to the hallucination problem that everyone talks about with LLMs. But then he offered a more nuanced view.

“So there’s this problem of how […] do you supervise a model that’s kind of superhuman,” he said, adding that “sometimes this is called scalable oversight or scalable supervision.”

Ultimately, he continued, the supervision issue boils down to how to make sure LLMs are doing what humans want. However, in this case, “some of the problems haven’t even been formulated precisely yet.”

Google Continues Expansion of AI in Workspace, Dev Tools
https://thenewstack.io/google-continues-expansion-of-ai-in-workspace-dev-tools/ | Wed, 30 Aug 2023

One could capsulize the dozens of news announcements made Aug. 29 on Day 1 of Google Cloud Next 23 this way: Business users and software developers alike will be able to do a lot more work in shorter time frames by simply writing simple queries, then pointing and clicking in the company’s AI-augmented workplace and development apps.

It’s all about the application understanding a user’s intentions and being prepared ahead of time as to what the use case entails. Google Next 23, the first in-person Cloud Next since before the pandemic in 2019, was expected to draw more than 20,000 attendees to the Moscone Center in San Francisco this week.

“We are in an entirely new era of cloud, fueled by generative AI,” Google Cloud CEO Thomas Kurian told a packed opening day audience. “Our focus is on putting gen AI tools into the hands of everyone across the organization — from IT, to operations, to security, to the board room.”

Here are product and services highlights of what the world’s largest and most successful search provider revealed at the conference.

Duet AI in Workspace

Duet AI in both the line-of-business workplace and in developer teams seemed to make the most news.

Google Cloud first introduced Duet AI in Workspace, which makes the Workspace apps come alive, at its I/O conference in May, describing how it would work in theory. Today the company announced general availability and showcased exactly what it does, demonstrating new features that create the right type of document for a particular use case, connect and integrate disparate articles and blogs, and find appropriate images. It can even provide suggestions when the user is stuck on a problem.

For example: An employee wants to create a new marketing presentation for a client’s company. Starting in Google Drive with the enclosed Duet AI app, the user can prompt for the type of doc (say, Slides), enter the name of the client and search for any images or charts, forms or PDFs connected with the client. Duet AI automatically gathers all the elements and creates a multipage presentation — complete with headlines, text, images and charts — within a few minutes. Provided with a sophisticated-looking prototype, the user then can edit, rearrange and change anything needed to finalize the project. Having a head start like that obviously can save hours or days of time.

Other new options include:

  • Duet AI in Google Meet: Duet AI can take notes during video calls, send meeting summaries and even automatically translate captions in 18 languages. Duet AI in Meet also adds studio look, studio lighting and studio sound.
  • Duet AI in Google Chat: Users can chat directly with Duet AI to ask questions about their content, get a summary of documents shared in a space and catch up on missed conversations.

Google Workspace VP Aparna Pappu said on stage that her team is working on enhancing Duet “so that we can go from a prompt-based interaction to a much richer contextual interaction that takes into account what you’re working on. Whether it’s an email, a document or a spreadsheet, it will offer you proactive help, such as generating summaries or suggesting creative ideas. Soon it will even take action on your behalf.”

Duet AI in Google Cloud for Developers

The company previewed the capabilities of Duet AI in Google Cloud, with general availability coming later this year. Beyond Workspace, Pappu said, Duet AI will provide AI assistance as a coding assistant to help developers code faster, as an expert adviser to help operators troubleshoot application and infrastructure issues, as a data analyst to provide quick and better insights, and as a security adviser to recommend best practices to help prevent cyber threats.

Pappu said that Duet AI in Google Cloud is designed to serve as a trusted tool for developers in software development, infrastructure ops, data analytics, security ops and databases. In software development, Duet AI will enable developers to stay in flow state longer by minimizing context switching to help them be more productive. In addition to code completion and code generation, it can assist with code refactoring and building APIs using simple natural language prompts.

In infrastructure, operators can chat with Duet AI in natural language directly in the Google Cloud Console to retrieve “how to” information about infrastructure configuration, deployment best practices, and expert recommendations on cost and performance optimization, Pappu said. In data analytics, Duet AI in BigQuery provides contextual assistance for writing SQL queries as well as Python code, generates full functions and code blocks, auto-suggests code completions and explains SQL statements in natural language, and can generate recommendations based on specific schema and metadata.

In security operations, Duet AI will be coming later this year to security products that include Chronicle Security Operations, Mandiant Threat Intelligence, and Security Command Center.

Google said that user code, user inputs to Duet AI and recommendations generated by Duet AI will not be used to train any shared models nor used to develop any products.

Vertex AI Gets Upgrade

Vertex AI is a development platform used to build, deploy and scale machine learning models. Users currently have access to more than 100 foundation models — including third-party and open source versions — as well as industry-specific models such as Sec-PaLM 2 for cybersecurity and Med-PaLM 2 for healthcare and life sciences. New additions announced today include:

  • Vertex AI Search and Conversation: The tools now enable dev teams to create search and chat applications using their own data, with minimal coding and enterprise-grade management and security built in.
  • PaLM 2, Imagen and Codey upgrades: This includes updating PaLM 2 to 32k context windows so enterprises can process longer-form documents, such as research papers and books.
  • New models: Google Cloud is announcing the availability of Llama 2 and Code Llama from Meta, and the Technology Innovation Institute’s open source Falcon LLM. It also announced the availability of Claude 2 from Anthropic. Google said it will be the only cloud provider offering both adapter tuning and RLHF for Llama 2.

Simplified Analytics at Scale

New capabilities in Google’s Data and AI Cloud are designed to boost productivity for data teams. Announcements included a new BigQuery Studio interface for data engineering, analytics, and predictive analysis; and AlloyDB AI, which offers an integrated set of capabilities for building GenAI apps, including high-performance and vector queries. Google also announced that several of its new Data Cloud partners, including Confluent, DataRobot, Dataiku, DataStax, Elastic, MongoDB, Neo4j, Redis, SingleStore and Starburst, are all launching new capabilities aimed to enhance gen AI development with data.

Addressing Security Issues

Google Cloud claims to be the only security provider that brings together frontline intelligence and expertise, a modern AI-infused security operations platform and a trusted cloud foundation. At the conference, the company introduced Mandiant Hunt for Chronicle, which integrates the latest insights into attacker behavior from Mandiant’s frontline experts with Chronicle Security Operations’ ability to analyze and search security data.

The company also announced agentless vulnerability scanning, which provides vulnerability management capabilities in Security Command Center to detect operating system, software and network vulnerabilities on Compute Engine virtual machines.

Microsoft PowerShell Gallery Littered with Critical Vulnerabilities
https://thenewstack.io/microsoft-powershell-gallery-littered-with-critical-vulnerabilities/ | Wed, 30 Aug 2023

If you give a hoot about code security, you already know that popular code-package managers and repositories, such as Node Package Manager (npm) and Python Package Index (PyPI), are overstuffed with vulnerabilities and the malware that goes with them. What none of us knew is that PowerShell Gallery, Microsoft’s central repository for sharing PowerShell code, including PowerShell modules, scripts and Desired State Configuration resources, has the same kind of problems. That’s what Aqua Security’s Aqua Nautilus found when they checked the Gallery’s policies and discovered numerous serious security risks.

Now, you may have thought that that would only be a worry for Windows shops, where PowerShell is the default command line interface. You’d be wrong. The PowerShell Gallery’s code is primarily used to manage cloud resources not just for Azure, but for other major cloud vendors such as Amazon Web Services (AWS).

These flaws make PowerShell code susceptible to typosquatting attacks. This can lead to the inadvertent installation of malicious modules, which could be catastrophic for organizations, given PowerShell Gallery modules’ wide adoption in the cloud deployment process.

But wait, there’s more!

Attackers can also exploit vulnerabilities to detect unlisted packages and expose deleted secrets.

Specifically, Aqua uncovered the following flaws:

  • Lax Naming Policy: Unlike stringent naming policies in other package managers like, ironically enough, troubled npm, PowerShell Gallery lacks protection against typosquatting. For example, while most Azure-related packages follow the “Az.<package_name>” pattern, not all do. Attackers can spoof genuine-looking modules, potentially running malicious code on unsuspecting users’ systems.
  • Authorship Spoofing: The landing pages of PowerShell modules can be manipulated to display fake details. The only credible details available to users, the download count and the last published date, can be easily manipulated.
  • Exposing Unlisted Modules: Despite Microsoft’s official documentation suggesting that unlisted packages in PowerShell Gallery remain hidden from public view, Aqua Nautilus’s research suggests otherwise. They could access both listed and unlisted packages and their respective versions.

Is it really that bad? Yes, it is. Aqua Nautilus created a package, “Az.Table,” imitating the highly popular “AzTable” package. When downloaded, the mimic package could gather metadata. Need I say more? It sheds light on the potential harm malicious entities could inflict using these vulnerabilities.

Surprisingly, even after these vulnerabilities were reported to Microsoft’s Security Response Center (MSRC) twice, there’s been no significant rectification. The problem was first reported on September 27, 2022, and then again on January 3, 2023. Both times, MSRC confirmed the flaws but claimed they had been fixed. They hadn’t been. As of August 2023, Aqua Nautilus could still reproduce the issues.

You’d think Microsoft, with all its resources, would be far more proactive than the comparatively poor npm and PyPI. It seems you’d be wrong.

First and foremost, Aqua demands that Microsoft Fix The Problem. This could include implementing a strict package naming policy, verifying authorship, restricting access to unlisted packages, and improving the visibility of package ownership.

I agree. Microsoft must take this seriously before it blows up in a customer’s face and then in its own. With Microsoft’s recent major security snafus, the company can’t afford another one.

But, what can we do in the meantime? Aqua suggests:

  1. Use a Signed PowerShell Module Policy: Enforce a policy that only allows the execution of signed scripts. This ensures that any script or module, including those downloaded from the PowerShell Gallery, must be digitally signed with a trusted certificate before it can be run, providing an additional layer of security against the execution of malicious scripts.
  2. Use a Trusted Private Repository: This can ensure that the repository has limited internet access and user access, where you can manage and consume your private modules while also storing modules from the public PowerShell Gallery in a more secure way (see the sketch after this list).
  3. Regularly Scan for Sensitive Data: This includes scanning the modules’ source code for secrets and conducting regular security assessments of the repositories that store and manage the module’s code. It’s important to promptly address and rotate any exposed secrets to prevent exploitation by attackers.
  4. Detect Suspicious Behavior in Cloud Environments: Implement a robust continuous monitoring system that tracks activities in real time across your CI/CD pipelines and cloud infrastructure. This proactive approach allows you to detect potential threats and suspicious behavior, as well as any deviations from established normal profiles.
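
For the first two recommendations, a minimal hardening sketch might look like the following. It assumes PowerShell 7+ is installed as pwsh, the internal repository URL is a placeholder, and execution policies only take effect on Windows hosts.

# Treat the public PowerShell Gallery as untrusted so installs require explicit confirmation.
pwsh -Command "Set-PSRepository -Name PSGallery -InstallationPolicy Untrusted"

# Register a private, internal repository as the trusted source for modules.
# The source URL is a placeholder for illustration only.
pwsh -Command "Register-PSRepository -Name InternalRepo -SourceLocation 'https://nuget.example.internal/api/v2' -InstallationPolicy Trusted"

# On Windows hosts, require that scripts and modules be digitally signed before they run.
pwsh -Command "Set-ExecutionPolicy -ExecutionPolicy AllSigned -Scope LocalMachine"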

Yes, that’s a lot of work. But consider the potentially disastrous alternatives. I think you’ll agree that this is one time when security trumps short-term savings.

Simplifying Cluster Connectivity with Istio Service Mesh
https://thenewstack.io/simplifying-cluster-connectivity-with-istio-service-mesh/ | Wed, 23 Aug 2023

This is the first in a two-part series.

Multicluster service connectivity is becoming essential in modern distributed applications and cloud native environments. Some of the key reasons organizations require multicluster service connectivity include:

  • Microservices and scaling: In microservices architectures, services are broken down into smaller, manageable components. Multicluster service connectivity allows deploying microservices independently in different clusters, facilitating horizontal scaling and simplifying application management.
  • Geographic distribution: Multicluster service connectivity allows the distribution of applications and services across different virtual private clouds (VPCs), regions or data centers, reducing latency and providing better performance for users in various geographical locations.
  • High availability and redundancy: Connecting services across multiple clusters provides high availability and redundancy. If one cluster goes down due to maintenance or unexpected issues, the services can seamlessly fail over to another cluster, ensuring continuous service availability.
  • Load balancing and traffic distribution: By distributing traffic across multiple clusters, organizations can balance the load on individual clusters, preventing overloading and ensuring optimal performance.
  • Specialized services: Access to specialized services is one of the significant advantages of adopting a multicloud strategy. It allows organizations to leverage unique and specialized services provided by different cloud providers, tailoring their solutions to meet specific business needs.
  • Cost optimization: Organizations can optimize their cloud spending by selecting cost-effective specialized services from different providers. Based on workload demands, they can take advantage of price differences, spot instances and reserved instances.
  • Flexibility and agility: Multicluster service connectivity provides the flexibility to deploy applications in diverse environments, supporting various development and testing workflows and allowing faster experimentation and innovation.

Because of the above reasons, running large applications spanning multiple cloud regions or sometimes across different cloud providers has become a common practice.

What Is a Service Mesh?

Service mesh is a dedicated infrastructure layer that handles service-to-service communication within a distributed application. It is particularly prevalent in cloud native environments, where applications are built using a microservices architecture. It provides a set of functionalities and capabilities that enhance the connectivity, security and observability of microservices-based applications.

Service mesh has become the de facto standard for connecting multicluster services due to its ability to address the challenges and complexities associated with microservices architectures and multicluster environments. Here are some key reasons service mesh emerged as the standard solution for multicluster service connectivity:

  • Microservices architecture: Service mesh provides a dedicated layer for handling service-to-service communication, offering features like load balancing, service discovery and routing, making it ideal for microservices-based applications.
  • Network complexity: In multicluster environments, managing network connectivity between clusters, especially in different cloud providers or data centers, can be daunting. Service mesh abstracts away this complexity, providing a consistent and unified approach to managing service communication across clusters.
  • Consistent service-to-service communication: Service mesh ensures uniform connectivity between services, regardless of location or the underlying infrastructure. This consistent communication pattern is crucial for multicluster setups, enabling seamless interactions between services running in different clusters.
  • Security and encryption: In multicluster environments, securing communication between services becomes critical. Service mesh solutions often offer built-in security features like mutual TLS encryption, authentication and authorization, ensuring secure communication channels between services across clusters.
  • Observability and monitoring: Monitoring and debugging applications in multi-cluster environments can be challenging due to the distributed nature of the infrastructure. Service mesh platforms typically provide powerful observability tools, such as logging, tracing and metrics, allowing comprehensive monitoring of service-to-service communication across clusters.
  • Vendor neutrality: Service mesh solutions are typically cloud-agnostic and support various Kubernetes-based environments. This vendor neutrality will enable organizations to implement multicluster service connectivity without being locked into a specific cloud provider.
  • Community adoption and ecosystem: Service mesh technology, particularly solutions like Istio and Linkerd, has gained widespread adoption with an active community and ecosystem. The availability of documentation, tutorials and community support makes it easier for organizations to adopt and integrate service mesh into their multicluster architectures.
  • Continuous evolution and improvement: Service mesh technologies continue to evolve and improve, with regular updates, new features and performance enhancements being introduced. This ongoing development ensures that service mesh remains relevant and capable of addressing the evolving needs of multicluster environments.
  • Industry standards and best practices: As service mesh adoption has grown, it has become a recognized industry standard and best practice for connecting multicluster services. Industry leaders and cloud native organizations widely endorse and promote the use of service mesh to address the challenges of multicluster connectivity.

The traffic management, security and observability capabilities of service mesh make it a compelling choice for organizations seeking to harness the benefits of multicloud and hybrid-cloud architectures.

Key Considerations for Setting Up a Multicloud/Multicluster Istio Environment

Setting up a multicluster service mesh involves several steps to ensure seamless communication between services across Kubernetes clusters. Below are prerequisites and several key considerations when setting up a multicluster service mesh using the popular service mesh platform Istio.

Prerequisites:

  • Kubernetes clusters: You need at least two Kubernetes clusters in different environments (different cloud providers, on-premises or hybrid).
  • Kubernetes cluster access: Ensure you have access and the necessary permissions to manage resources in each cluster.
  • Istio installation: Install Istio on each cluster. Follow the official Istio documentation for the installation steps.

Key Considerations: 

Configure trust and certificates: Establish trust between the Kubernetes clusters to enable secure communication between the clusters. This typically involves setting up certificates and keys for mutual TLS authentication between the clusters.
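
In Istio, this typically means plugging the same root certificate authority into every cluster. A minimal sketch, assuming per-cluster intermediate certificates have already been generated from a shared root, looks like this (repeat with the other cluster's context):

# Hedged sketch: install plugged-in CA certificates before installing Istio on each cluster.
kubectl --context=cluster1 create namespace istio-system
kubectl --context=cluster1 create secret generic cacerts -n istio-system \
  --from-file=ca-cert.pem \
  --from-file=ca-key.pem \
  --from-file=root-cert.pem \
  --from-file=cert-chain.pem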

Enable cross-cluster communication: Ensure that the Kubernetes clusters can communicate with each other over the network. This may require configuring firewalls, network policies or load balancers to allow traffic between the clusters.

Configure Istio control plane: Set up the Istio control plane on each cluster. The control plane manages and configures the Istio components, including sidecar proxies, across the clusters.
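
A sketch of installing the control plane on one cluster with multicluster-aware settings follows; the mesh, cluster and network names are illustrative, and the operator API fields can differ across Istio versions. Repeat on the second cluster with its own clusterName (and network, if the clusters sit on different networks).

# Hedged sketch: install the Istio control plane on cluster1 with multicluster identifiers.
cat <<'EOF' > cluster1.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      meshID: mesh1
      multiCluster:
        clusterName: cluster1
      network: network1
EOF

istioctl install --context=cluster1 -f cluster1.yaml -y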

Configure sidecar proxies: Deploy sidecar proxies (Envoy) alongside the services in each cluster. Sidecar proxies intercept and manage the traffic to and from the services.

Configure service discovery: Configure service discovery to enable services in one cluster to discover and communicate with services in other clusters. This might involve exposing the Kube API server across networks so the Istio control plane can perform service discovery.
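
With Istio, this is typically done by exchanging remote secrets that give each control plane read access to the other cluster's API server; older releases expose the command as istioctl x create-remote-secret.

# Hedged sketch: enable cross-cluster service discovery in both directions.
istioctl create-remote-secret --context=cluster1 --name=cluster1 | \
  kubectl apply -f - --context=cluster2

istioctl create-remote-secret --context=cluster2 --name=cluster2 | \
  kubectl apply -f - --context=cluster1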

Configure traffic routing: Define traffic routing rules to control how requests are routed between services in different clusters. Istio’s traffic management features, such as VirtualServices and DestinationRules, can be used for this purpose.
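
For example, a VirtualService and DestinationRule pair can split traffic between two subsets of a service; the service name, subset labels and weights below are purely illustrative.

# Hedged sketch: route 80/20 between two versions of the reviews service.
kubectl apply --context=cluster1 -f - <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 80
    - destination:
        host: reviews
        subset: v2
      weight: 20
EOF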

Configure load balancing and failover: Configure load balancing and failover mechanisms to ensure that traffic is efficiently distributed among service instances in different clusters and that services can fail over to other clusters if needed.

Configure security: Set up Istio’s security features, such as mutual TLS authentication and authorization policies, to secure communication between services across clusters.

Monitor and observe: Use Istio’s observability features, such as distributed tracing and metrics, to monitor the health and performance of the multicluster service mesh.

Test and verify: Thoroughly test the setup to ensure that services in different clusters can communicate seamlessly and that traffic is routed correctly.

Continuous maintenance and updates: Regularly maintain and update the multicluster service mesh to keep it secure, performant and aligned with the evolving needs of the applications and clusters.

Challenges

It’s important to note that setting up a multicluster service mesh can be complex, and the exact steps can vary depending on the service mesh platform and your specific environment. Here are some key challenges involved in setting up and maintaining multicluster service mesh:

  • Consistent configuration: Ensuring consistent configuration across multiple clusters is crucial for the proper functioning of the service mesh.
  • Network connectivity: A vital step, establishing network connectivity requires setting up secure communication channels, often across public or hybrid cloud environments. Dealing with network infrastructure, firewalls and security policies can introduce challenges in establishing and maintaining connectivity between clusters.
  • Service discovery: Ensuring that services in one cluster can discover and communicate with services in other clusters requires careful configuration and coordination.
  • Monitoring and troubleshooting: Monitoring and troubleshooting can be complex due to the increased number of components and the distributed nature of the infrastructure.

To address these challenges, adopting Infrastructure as Code (IaC) approaches for configuration management and automation tools for consistent deployments is recommended. At Rafay, we have also developed an open source CLI tool to simplify the configuration.

The second part of this blog series will share a reference design and example configuration of a multicluster Istio service mesh deployment as well as more details on the open source CLI tool.

 

How SaaS Companies Can Monetize Generative AI
https://thenewstack.io/how-saas-companies-can-monetize-generative-ai/ | Fri, 18 Aug 2023

You’ve already been part of a conversation at your company, either as a contributor or an observer, on how your customers can benefit from the increased value of your products once they are infused with generative AI, LLMs or custom AI/ML models.

Universally, product roadmaps are being upended to incorporate AI. As you hash out your approach and draw up the enhanced roadmap, I want to share some words of advice from the good ol’ California Gold Rush: Don’t show up to the gold rush without a shovel!

Similarly, don’t overlook the monetization aspect of your SaaS and AI. Factor it in at the outset and integrate the right plumbing at the start — not as an afterthought or post-launch.

What’s Changing? SaaS Is Shifting to Metered Pricing

Two years ago, I wrote about the inevitable shift to metered pricing for SaaS. The catalyst that would propel the shift at the time was unknown, but the foundational thesis outlined that it was inevitable. No one could have predicted in 2021 that a particular form of AI would serve to be that catalyst.

First thing to realize is that this is not merely a “pricing” change. It is a monetization model change. A pricing change would be a change in what you charge, for example, going from $79 per user/month to $99 per user/month. A monetization model change is a fundamental shift in how you charge, which inevitably will also change what you charge. It’s a business model change.

Traditionally, SaaS pricing has been a relatively lightweight exercise, often decoupled from product or product teams. With a per-user or per-seat model, as long as the price point was set sufficiently (and in some cases arbitrarily) high above a certain threshold that covered for underlying costs with the desired margin, that’s all that was needed. It was essentially a one-size-fits-all approach requiring almost no need for usage instrumentation or product usage tracking and reporting.

SaaS and AI Pivots This on Its Head

Your technology stack will increasingly include more third-party, value-added AI/ML components, further infused with additional custom models layered on top. You are going to operate in a multivendor business-tier (not just infrastructure) ecosystem. These new value-added business-tier components in the form of AI/ML will, in turn, come with usage-based pricing and charge models. See ChatGPT pricing.

Each user of your SaaS application will stretch and use these metered components in different ways, thereby propelling you to also charge on a metered basis to align with underlying costs and revenue.

Deploy a Proven and Scalable Approach

While on the surface it may seem daunting, believe me, this is a welcome change. Lean into it.

Not only will it enable you to provide your customers with flexible and friendly consumption-based pricing, but it will also drive a level of operational efficiency and discipline that will further contribute to your bottom line.

Start with de-coupled metering, and then layer a usage-based pricing plan on top. For example, Stripe leverages GPT-4 from OpenAI to enrich the customer-facing experience in its documentation. Instacart has also integrated with ChatGPT to create an Ask Instacart service. The app will allow users to research food-related queries in a conversational language such as healthy meal formulations, recipe ideas based on given ingredients and generated shopping lists based on the ingredients of a particular recipe.

Beyond integrating with ChatGPT and other services, traditional software companies are developing their own GenAI technologies as well. For example, Adobe has rolled out Adobe Firefly to offer its own text- and image-generation capabilities to creatives.

As these capabilities become natively integrated and expected by customers, it will be imperative to track usage and develop a flexible, transparent pricing model that scales to all levels of consumption.

Usage-Based Pricing Is a Natural Fit for Generative AI Companies

Generative AI and Usage-Based Pricing: A Complementary Pair

ChatGPT parses the text prompt to generate an output based on the “understanding” of that prompt. The prompts and outputs vary in length where the prompt/output size and resource consumption are directly related, with a larger prompt requiring greater resources to process and vice versa. Additionally, the usage profile can be expected to vary significantly from customer to customer. One customer may only use the tool sparingly, while another could be generating new text multiple times daily for weeks on end, and the pricing model must account for this variability.

On top of this, services like ChatGPT are themselves priced according to a usage-based model. This means that any tools leveraging ChatGPT or other models via API will be billed based on usage; since the backend costs of providing the service are inherently variable, the customer-facing billing should be usage-based as well.

To deliver the fairest and most transparent pricing, and to enable frictionless adoption and user growth, these companies should look to usage-based pricing with a product-led go-to-market motion. Having both elastic frontend usage and elastic backend costs positions generative AI products as an ideal fit for a usage-based and product-led approach.

How to Get Started

Meter frontend usage and backend resource consumption

Rather than building these models from scratch, many companies elect to leverage OpenAI’s APIs to call GPT-4 (or other models), and serve the response back to customers. To obtain complete visibility into usage costs and margins, each API call to and from OpenAI tech should be metered to understand the size of the input and the corresponding backend costs, as well as the output, processing time and other relevant performance metrics.

By metering both the customer-facing output and the corresponding backend actions, companies can create a real-time view into business KPIs like margin and costs, as well as technical KPIs like service performance and overall traffic. After creating the meters, deploy them to the solution or application where events are originating to begin tracking real-time usage.
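
As a minimal sketch of this metering step, assuming OPENAI_API_KEY is set and jq is installed, the following captures token usage from a single OpenAI API call and forwards it to a metering service; the customer ID, meter name and ingestion endpoint are illustrative placeholders rather than any specific vendor API:

# Call the OpenAI chat completions API; the response includes a usage object.
response=$(curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Draft a renewal email."}]}')

# Extract prompt and completion token counts for cost and margin tracking.
prompt_tokens=$(echo "$response" | jq '.usage.prompt_tokens')
completion_tokens=$(echo "$response" | jq '.usage.completion_tokens')

# Emit a meter event to a hypothetical ingestion endpoint.
curl -s -X POST https://metering.example.com/ingest \
  -H "Content-Type: application/json" \
  -d "{\"customer_id\": \"cust-123\", \"meter\": \"gpt4_tokens\", \"prompt_tokens\": $prompt_tokens, \"completion_tokens\": $completion_tokens}"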

Track usage, margins and account health for all customers

Once the metering infrastructure is deployed, begin visualizing usage and costs in real time as usage occurs and customers leverage the generative services. Identify power users and lagging accounts and empower customer-facing teams with contextual data to provide value at every touchpoint.

Since generative AI services like ChatGPT use a token-based billing model, obtain granular token-level consumption information for each customer using your service. This helps to inform customer-level margins and usage for AI services in your products, and it is valuable intel going into sales and renewal conversations. Without a highly accurate and available real-time metering service, this level of fidelity into customer-level consumption, costs and margins would not be possible.

Launch and iterate with flexible usage-based pricing

After deploying meters to track the usage and performance of the generative AI solution, the next step is to monetize this usage with usage-based pricing. Identify the value metrics that customers should be charged for. For text generation this could be the word count or the total processing time to serve the response; for image generation it could be the size of the input prompt, the resolution of the image generated or the number of images generated. Commonly, the final pricing will be built from some combination of multiple factors like those described.
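
As a rough sketch of combining several value metrics into a single charge (the rates and counts below are invented purely for illustration):

# Hypothetical blended usage-based charge: words generated, processing time and images.
words=12500     # words generated this billing period
seconds=340     # total processing time in seconds
images=18       # images generated

# charge = words * per-word rate + seconds * per-second rate + images * per-image rate
charge=$(echo "$words * 0.0002 + $seconds * 0.001 + $images * 0.04" | bc -l)
printf "Usage charge for this billing period: \$%.2f\n" "$charge"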

After the pricing plan is created and assigned to customers, real-time usage will be tracked and billed. The on-demand invoice will be kept up to date so that, at any time, both the vendor and customers can view current usage charges.

Integrate with your existing tools for next-generation customer success

The final step, once metering is deployed and the billing service is configured, is to integrate with third-party tools inside your organization to make usage and billing data visible and actionable. Integrate with CRM tooling to augment customer records with live usage data or help streamline support ticket resolution.

With real-time usage data being collected, integrate this system with finance and accounting tools for usage-based revenue recognition, invoice tracking and other tasks.

Amberflo for Generative AI

Amberflo provides an end-to-end platform for customers to easily and accurately meter usage and operate a usage-based business. Track and bill for any scale of consumption, from new models in beta testing up to production-grade models with thousands of daily users. Amberflo is flexible and infrastructure-agnostic to track any resource with any aggregation logic.

Build and experiment with usage-based pricing models, prepaid credits, hybrid pricing or long-term commitments to find the best model and motion to suit any unique business and customer base. Leverage real-time analytics, reporting and dashboards to stay current on usage and revenue, and create actionable alerts to receive notifications when key thresholds or limits are met.

The post How SaaS Companies Can Monetize Generative AI appeared first on The New Stack.

]]>
The Architect’s Guide to Thinking about Hybrid/Multicloud https://thenewstack.io/the-architects-guide-to-thinking-about-hybrid-multicloud/ Fri, 18 Aug 2023 15:22:18 +0000 https://thenewstack.io/?p=22716054

Recently, a journalist asked us to help frame the challenges and complexities of the hybrid cloud for technology leaders. While

The post The Architect’s Guide to Thinking about Hybrid/Multicloud appeared first on The New Stack.

]]>

Recently, a journalist asked us to help frame the challenges and complexities of the hybrid cloud for technology leaders. While we suspect many technologists have given this a fair amount of thought, we also know from first-hand discussions with customers and community members that this is still an area of significant inquiry. We wanted to summarize that thinking into something practical, expanding where appropriate and becoming prescriptive where it was necessary.

We’ll start by saying that the concepts of the hybrid cloud and the multicloud are difficult to unbundle. If you have a single on-premises private cloud and a single public cloud provider, doesn’t that qualify you as a multicloud? Not that anyone really has just two. The team at Flexera does research every year on the subject and found that 87% of enterprises consider themselves multicloud, with 3.4 public clouds and 3.9 private clouds on average, although these numbers are actually down a touch from last year’s report.

There is a legitimate question in there: Can you have too many clouds?

The answer is yes. If you don’t design things correctly, you can find yourself in the dreaded “n-body problem” state. This is a physics term that was co-opted for software development. In the context of multiple public clouds, the “n-body problem” refers to the complexity of managing, integrating and coordinating these clouds. In an n-cloud environment, each cloud service (Amazon Web Services (AWS), Azure, Google Cloud, etc.) can be seen as a “body.” Each of these bodies has its own attributes, like APIs, services, pricing models, data management tools, security protocols, etc. The n-body problem in this scenario would be to effectively manage and coordinate these diverse and often-complex elements across multiple clouds. A few examples include interoperability, security and compliance, performance management, governance and access control, and data management.

As you add more clouds (or bodies) into the system, the problem becomes exponentially more complex because the differences between the clouds aren’t just linear and cannot be extrapolated from pairwise interactions.

Overcoming the n-body problem in a multicloud environment requires thoughtful architecture, particularly around the data storage layers. Choosing cloud native and, more importantly, cloud-portable technologies can unlock the power of multicloud without significant costs.

On the other hand, can there be too few clouds? If too few equals one, then probably. More than one, and you are thinking about the problem in the right way. It turns out that too few clouds or multiple clouds with a single purpose (computer vision or straight backup for example) deliver the same outcome — lock-in and increased business risk.

Lock-in reduces optionality, increases cost and minimizes the firm’s control over its technology stack, choice within that cloud notwithstanding (AWS, for example, has over 200 services and more than 20 database services). Too few clouds can also create business risk. AWS and other clouds go down several times a year. Those outages can bring a business to a standstill.

Enterprises need to build a resilient, interchangeable cloud architecture. This means application portability and data replication such that when a cloud goes down, the application can fail over to the other cloud seamlessly. Again, you will find dozens of databases on every public and private cloud — in fact some of them aren’t even available outside of the public cloud (see Databricks). That is not where the problem exists in the “too few clouds” challenge.

The data layer is more difficult. You won’t find many storage options running on AWS, GCP, Azure, IBM, Alibaba, Tencent and the private cloud. That is the domain of true cloud native storage players — those that are object stores, software-defined and Kubernetes native. In an ideal world AWS, GCP and Azure would all have support for the same APIs (S3), but they don’t. Applications that depend on data running on one of these clouds will need to be redesigned to run on another. This is the lock-in problem.

The key takeaway is to be flexible in your cloud deployment models. Even the most famous “mono-cloud” players like Capital One have significant on-premises deployments — on MinIO in fact. There is no large enterprise that can “lock onto” one cloud. The sheer rigidity of that would keep enterprises from buying companies that are on other clouds. That is the equivalent of cutting off one’s nose to spite one’s face.

Enterprises must be built for optionality in the cloud operating model. It is the key to success. The cloud operating model is about RESTful APIs, monitoring and observability, CI/CD, Kubernetes, containerization, open source and Infrastructure as Code. These requirements are not at odds with flexibility. On the contrary, adhering to these principles provides flexibility.

So What Is the Magic Number?

Well, it is not 42. It could, however, be three. Provided the enterprise has made wise architectural decisions (cloud native, portable, embracing Kubernetes), the answer will be between three and five clouds with provisions made for regulatory requirements that dictate more.

Again, assuming the correct architecture, that range should provide optionality, which will provide leverage on cost. It will provide resilience in the case of an outage, it will provide richness in terms of catalog depth for services required, and it should keep the n-body problem manageable.

What about Manageability?

While most people will tell you complexity is the hardest thing to manage in a multicloud environment, the truth is that consistency is the primary challenge. Having software that can run across clouds (public, private, edge) provides the consistency to manage complexity. Take object storage. If you have a single object store that can run on AWS, GCP, Azure, IBM, Equinix or your private cloud, your architecture becomes materially simpler. Consistent storage and its features (replication, encryption, etc.) enable the enterprise to focus on the application layer.

Consistency creates optionality, and optionality creates leverage. Reducing complexity can’t come at some unsustainable cost. By selecting software that runs across clouds (public, private, edge) you reduce complexity and you increase optionality. If it’s cheaper to run that workload on GCP, move it there. If it’s cheaper to run that database on AWS, run it there. If it’s cheaper to store your data on premises and use external tables, do that.

Choose software that provides a consistent experience for the application and the developer and you will achieve optionality and gain leverage over cost. Make sure that experience is based on open standards (S3 APIs, open table formats).
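
As a small sketch of that consistency, assuming the AWS CLI and an S3-compatible on-premises store such as MinIO (the bucket name and endpoint are placeholders), the same S3 call works against either target by switching the endpoint:

# Same tool, same S3 API, against AWS.
aws s3 ls s3://analytics-data

# Against an S3-compatible on-prem object store, only the endpoint changes.
aws s3 ls s3://analytics-data --endpoint-url https://minio.internal.example.com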

Bespoke cloud integrations turn out to be a terrible idea. As noted, each additional native integration is an order of magnitude more complex. Think of it this way: If you invest in dedicated teams, you are investing $5 million to $10 million per platform, per year in engineers. That doesn’t account for the cost of the tribal knowledge for each cloud. In the end, it results in buggy, unmaintainable application code. Software-defined, Kubernetes-centric software can solve these problems. Make that investment, not the one in bespoke cloud integrations.

What We Fail to See …

As IT leaders, we often deal with what is in front of us: shadow IT or new M&A integrations. Because of this we often fail to create the framework or first principles associated with the overall strategy. Nowhere is this more evident than the cloud operating model. Enterprises need to embrace first principles for containerization, orchestration, RESTful APIs like S3, automation and the like. Those first principles create the foundation for consistency.

IT leaders who attempt to dictate Cloud A over Cloud B because they get a bunch of upfront credits or benefits to other applications in the portfolio are suckers in the game of lock-in that companies like Oracle pioneered but the big three have mastered.

Going multicloud should not be an excuse for a ballooning IT budget and an inability to hit milestones. It should be the vehicle to manage costs and accelerate the roadmap. Using first cloud operating model principles and adhering to that framework provide the context to analyze almost any situation.

Cloud First or Architecture First?

A client asked us the other day if we recommend dispersing clouds among multiple services. It took us a moment to understand the question because it was the wrong one. The question should have been, “Should I deploy multiple services across multiple clouds?”

The answer to that is yes.

Snowflake runs on multiple clouds. Hashicorp Vault runs on multiple clouds. MongoDB runs on multiple clouds. Spark, Presto, Flink, Arrow and Drill run on multiple clouds. MinIO runs on multiple clouds.

Pick an architecture stack that is cloud native, and you will likely get one that is cloud-portable. This is the way to think about hybrid/multicloud.

The post The Architect’s Guide to Thinking about Hybrid/Multicloud appeared first on The New Stack.

]]>
A Middle Path for Data Sovereignty: Bring Your Own Cloud https://thenewstack.io/a-middle-path-for-data-sovereignty-bring-your-own-cloud/ Fri, 18 Aug 2023 13:32:55 +0000 https://thenewstack.io/?p=22716022

Emerging requirements for data sovereignty are driving an evolution in cloud deployment topologies. A new approach, known as Bring Your

The post A Middle Path for Data Sovereignty: Bring Your Own Cloud appeared first on The New Stack.

]]>

Emerging requirements for data sovereignty are driving an evolution in cloud deployment topologies. A new approach, known as Bring Your Own Cloud (BYOC), melds the control, compliance and data sovereignty benefits of self-hosting with the operational agility gained through fully managed SaaS offerings.

“Data sovereignty” is the notion that corporate data is subject to the laws and governance of the nation where data is collected, stored and processed. More than 100 countries have data sovereignty laws in place.

Organizations running services in the cloud are often subject to these data sovereignty requirements; however, traditionally it has been difficult if not impossible to be sure that cloud services store data in only a particular region.

For many years, organizations operating in the cloud have focused on data privacy in isolation, seeking to comply with a range of regulations — for example, GDPR in the European Union and the Health Insurance Portability and Accountability Act (HIPAA) in the United States.

As it turns out, data sovereignty is much more complex than data privacy alone. Privacy can be achieved in a straightforward manner, by relying on policy, which enables a declarative approach to deleting, masking, obfuscating and indexing sensitive data. Such an approach is commonly employed to protect predefined personally identifiable information (PII).

Data sovereignty, on the other hand, can be achieved only when the responsible organization controls the life cycle of the hard drives where data resides. There is no middle ground and no debate — data either resides on hard drives under your control or it does not. For this reason, it would seem that the only path to data sovereignty is self-hosted, single-tenant cloud deployments.

An even more extreme solution, “cloud repatriation,” moves everything back to on-premises infrastructure. Yet a move back to on-premises and self-hosted deployments often means sacrificing the operational, cost and scalability benefits that have made SaaS models so popular.

The challenge lies in the legacy of SaaS, which emerged long ago at a time when the world needed relief from self-hosting, but in so doing, SaaS introduced its own set of tradeoffs.

We have noted the challenge of data sovereignty, but SaaS solutions can also introduce the risk of vendor lock-in and the loss of visibility and control over sensitive data. And while it used to be easier for organizations to simply mandate that some sensitive applications should remain on premises indefinitely, we are now so addicted to the benefits of fully managed cloud services that it’s hard to imagine a permanent divorce.

The Dawn of BYOC

Thankfully, there is a third path that both balances the tradeoffs between self-hosted and SaaS models, and provides a manageable path to data sovereignty: Bring Your Own Cloud (BYOC).

In a BYOC deployment, an organization’s data remains inside its own virtual private cloud (VPC), while the vendor remotely operates and maintains the infrastructure. This option gives platform engineering teams more visibility and control than a pure SaaS model, while still allowing them to offload the time-consuming and resource-intensive work of managing cluster operations. This model has the added bonus of freeing them to focus on top business opportunities.

These factors — visibility, control and operations — are even more critical when managed services are powering an organization’s real-time data infrastructure. Many infrastructure teams are overwhelmed by the complexity of supporting real-time workloads at scale in the cloud — for example, by maintaining large Kafka clusters in multi-availability zone environments. At the same time, they struggle with data sovereignty issues as data regulations become increasingly onerous. A BYOC approach is ideal for navigating the compliance and regulatory requirements for real-time streaming data infrastructure.

Redpanda Cloud BYOC clusters are an example of a BYOC deployment model: the data plane remains in the customer’s virtual private cloud (VPC), while Redpanda’s control plane manages cluster operations.

BYOC: Beyond the Tradeoffs

BYOC balances the benefits and drawbacks of both self-hosted and SaaS models by giving you the control and flexibility of self-hosting without the complexity and risk. With BYOC, you are also able to implement security measures tailored to your specific environment. BYOC frees you from managing the platform on your own infrastructure so you can offload operations, support and maintenance to trusted experts.

Control

BYOC is a fully managed cloud model, but you retain more control than in a traditional SaaS model due to the separation of the control plane, which sits in the vendor’s cloud environment, and the data plane, which sits in your environment. This separation means that even when the vendor’s control plane is down, your system can run as usual and your data is available.

Cost

Cloud providers reward customers for long-term spending by providing committed spend or committed use discounts. The beauty of the BYOC model is that it enables organizations to continue taking advantage of those infrastructure discounts as if they were self-hosting.

Security

Beyond data sovereignty, BYOC also helps organizations comply with data privacy regulations. Leveraging zero trust access control and an isolated protected cluster, BYOC deployments can enforce multiple layers of security, all under the control of the team running the platform. BYOC also helps you to maintain least privilege for critical resources because the vendor’s control plane doesn’t have excessive credentials or permissions.

Which Deployment Option Is Right for You?

Platform engineering teams facing a range of deployment options, while dealing with spiraling cloud costs and service sprawl, now have the additional challenge of tackling data sovereignty. BYOC is a good option for organizations that need the benefits of self-hosting, such as control, observability and governance, without the inherent complexity and risk.

If your organization is embracing real-time data streaming and processing while also struggling with data sovereignty challenges, then BYOC is an option that engineering leaders in your organization should evaluate.

The post A Middle Path for Data Sovereignty: Bring Your Own Cloud appeared first on The New Stack.

]]>
Pros and Cons of Cloud Native to Consider Before Adoption https://thenewstack.io/pros-and-cons-of-cloud-native-to-consider-before-adoption/ Tue, 15 Aug 2023 13:26:48 +0000 https://thenewstack.io/?p=22715750

This is the second of a four-part series. Read Part 1. Cloud native adoption isn’t something that can be done

The post Pros and Cons of Cloud Native to Consider Before Adoption appeared first on The New Stack.

]]>

This is the second of a four-part series. Read Part 1.

Cloud native adoption isn’t something that can be done with a lift-shift migration. There’s much to learn and consider before taking the leap to ensure the cloud native environment can help with business and technical needs. For those who are early in their modernization journeys, this can mean learning the various cloud native terms, benefits, pitfalls and about how cloud native observability is essential to success. 

To help, we’ve created a four-part primer around “getting started with cloud native.” These articles are designed to educate and help outline the what and why of cloud native architecture.

The previous article included a definition of cloud native, its connection to DevOps methodology and architectural elements. This article will cover the pros and cons of cloud native adoption and implementation. 

Innovation Brings Complexity

A cloud native architecture speeds up application development since a large application can be broken into parts, and every part can be developed in parallel. That brings many benefits. But the complexity of cloud native apps makes it hard to see the relationship between various elements. That makes it harder to maintain performance, security and accuracy or diagnose problems in these areas when they arise.

So, let’s look at both the benefits and challenges of using a cloud native architecture.

Empowering the Modern Business 

Applications built using a cloud native architecture offer faster time to market, more scalable and efficient development, and improved reliability. Let’s look at the advantages in greater detail.

Faster Time to Market

A cloud native approach to developing applications speeds development times. The component nature of cloud native apps allows development to be distributed to multiple teams. And the work of these teams can be done independently. Each service owner can work on their component of the app simultaneously. One group is not dependent on another group finishing its part of the app before they can start on their own.

Additionally, cloud native apps allow components to be reused. So rather than creating a new frontend for every new app or a new “buy” capability, existing ones can be used on a new app. Reusing various elements greatly reduces the total amount of code that must be created for each new application.

Change one thing in the code for a monolithic structure, and it affects everything across the board. Microservices are independently deployed and don’t affect other services.

Efficiency

As noted, a cloud native approach lets smaller development teams work in parallel on a larger application. The idea is that a smaller team spends less time managing timetables, in meetings and keeping people up to date, and more time doing what needs to be done.

In such a work environment, these small teams access common company resources. That allows each team to benefit from cultural knowledge acquired over time throughout the organization. And naturally, the teams can work together, benefiting from each other’s best practices.

Scalability and Agility

In a cloud native environment, an organization can readily scale different functional areas of an application as needed. Specifically, running elements of a cloud native application on public clouds builds the capability to dynamically adjust compute, storage and other resources to match usage.

Adjustments can be to accommodate long-term trends or short-term changes. For instance, a retailer having a seasonal sale can increase the capacity of its shopping cart and search services to accommodate the surge in orders. Similarly, a financial institution seeing an increase in fraudulent activity may scale up machine learning fraud detection services.
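
On Kubernetes, a minimal sketch of this kind of dynamic adjustment, assuming a hypothetical shopping-cart deployment already exists, is a Horizontal Pod Autoscaler that grows and shrinks the service with demand:

# Scale the shopping-cart service between 3 and 30 replicas, targeting 70% CPU utilization.
kubectl autoscale deployment shopping-cart --min=3 --max=30 --cpu-percent=70

# Watch the autoscaler adjust replica counts as load changes.
kubectl get hpa shopping-cart --watch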

If you run everything through one monolithic application, it’s hard to manage the massive scale of services and respond to changing market conditions as an application grows.

Reliability and Resiliency

Because cloud native systems are based on loosely coupled, interchangeable components, they are less vulnerable to a larger set of failures compared to the classical monolithic application. If one microservice fails, it rarely causes an application-wide outage, although it could degrade performance or functionality. Similarly, containers are designed to be ephemeral, and the failure of one node will have little to no impact on the operations of the cluster. In short, in cloud native environments, the “blast radius” is much smaller when a component fails. When something fails, a smaller set of services or functions may be affected, rather than the entire application.

Cloud Native Also Comes with Challenges

Competitive benefits notwithstanding, cloud native adoption comes with its own set of challenges. None are insurmountable thanks to modern tooling, but understanding what you’re getting into with microservices and containers will set you up for success on your cloud native journey.

Complexity Can Impede Engineer Productivity 

The inherent design of microservices leads to significant complexity. Imagine a microservices architecture featuring thousands of interdependent services — it becomes much more difficult and time-consuming to isolate issues. Even visualizing these services and their connections is challenging, let alone wrapping your head around it. When microservices are so independent of each other, it’s not always easy to manage compatibility and other effects of different versions and workloads.

The infrastructure layer is not any simpler. Kubernetes is notoriously challenging to operate, in part because the ephemeral nature of containers means some may only live for a few seconds or minutes. There are many moving parts in a container orchestration system that all must be configured and maintained correctly.

All told, cloud native complexity places a new burden on engineers who are responsible for performance and reliability.

Unprecedented Observability Data Volume

With cloud native agility comes an explosion of observability data (metrics, logs, traces, events) that can slow down teams while they’re trying to solve customer-facing problems. Cloud native environments, especially as you start scaling them, emit massive amounts of observability data — somewhere between 10 and 100 times more than traditional VM-based environments. Each container emits the same volume of telemetry data as a VM, and scaling containers into the thousands and collecting more and more complex data (higher data cardinality) results in data volume becoming unmanageable.

The post Pros and Cons of Cloud Native to Consider Before Adoption appeared first on The New Stack.

]]>
How Quantic Improved Developer Experience, Scalability https://thenewstack.io/how-quantic-improved-developer-experience-scalability/ Mon, 14 Aug 2023 13:24:13 +0000 https://thenewstack.io/?p=22715559

Quantic is a fast-growing startup that helps businesses streamline their operations by using a full-featured cloud-based point-of-sale (POS) platform, including

The post How Quantic Improved Developer Experience, Scalability appeared first on The New Stack.

]]>

Quantic is a fast-growing startup that helps businesses streamline their operations by using a full-featured cloud-based point-of-sale (POS) platform, including devices carried around by staff. Initially, our company primarily focused on restaurant operations. We weren’t planning for the technology to become a stand-alone product. However, by 2015, the demand from other restaurants was so high that we decided to roll it out, extending services to retailers, grocery stores, hotels, mini marts, gift shops, car washes and other businesses. As workloads grew and customer portfolios expanded, the business had to navigate a wide variety of different rules, regulations and workflows for the different industries served.

The Challenge

Over time, as our business expanded to hundreds of clients, we outgrew our original NoSQL database. Our application needed to scale beyond what our existing database was able to handle. We also needed to provide customers with real-time data synchronization capabilities, which enable the replication of data across clusters located in different data centers, something our database didn’t support. On top of that, unplanned downtime hurt customer experience, while managing various clusters placed a huge strain on our developers. DevOps teams were forced to deal with complex database management issues rather than focus on software development.

Following a competitive evaluation, Quantic turned to Couchbase Capella database as a service for a scalable and simple, yet powerful, way to keep pace with an ever-expanding number of customers, products and features.

The Solution

Building a database in-house to support the needs of the company was out of the question due to the cost, time and talent requirements. The database, data syncing from cloud to edge and storage costs alone wouldn’t make sense for a business that’s trying to scale. After evaluating various databases, we selected Couchbase Capella on Amazon Web Services (AWS) for its high performance, multidimensional scalability and a flexible NoSQL architecture that developers found familiar and easy to use.

The price performance of Capella coupled with mobile support and developer-friendly features allowed customer applications to be available 24/7, even when network connectivity was down. Capella’s offline synchronization capabilities, paired with the flexibility of JSON and SQL++, ensured applications were always on and always fast. This made Capella an easy choice for the business.

Additionally, Capella simplified decision-making, allowing Quantic’s clients to increase their business by determining when to offer special deals based on historical sales data, all within a single platform. Through Capella’s high-performance indexing, reports can be generated faster, allowing customers to get the data they need when they need it. With Capella, Quantic powers its applications, including tableside order placement and payments, customer management, couponing, QR codes, loyalty programs and fast checkout.

The Results

Once Capella was deployed, Quantic experienced an immediate impact on the business. Our customers need most applications to function in real time. From time tracking to sending orders to the kitchen via kitchen display systems, real-time communication is essential. This requires data to be synced quickly across the organization’s entire architecture. With Capella in our tech stack, the company receives instant updates and is able to provide a seamless end-user experience. Customers have uninterrupted access to data that has been aggregated over short or long periods of time.

The business also noticed that indexing speeds became faster, which made a substantial impact on reporting. And in some instances, query times were cut in half for end users.

What the Future Holds

Quantic continues to grow. The business recently developed a white-label POS platform, allowing other vendors to sell our services as their own. Through the white-label program, anyone, from an independent sales organization to a bank, can provide its customers with a solution that can grow its brand. Since the heavy lifting has been done using Capella, we’re able to expand our POS systems, while simultaneously helping partners expand their brands.

As the company scales and workloads balloon, Couchbase will help Quantic reduce database management efforts so development teams can focus on product enhancements to provide end users with a seamless experience.

Learn more about how Couchbase Capella on AWS enables Quantic to manage and scale the company’s growing workloads while improving the developer experience here. Try Capella on AWS for free here.

The post How Quantic Improved Developer Experience, Scalability appeared first on The New Stack.

]]>
Is Jamstack Toast? Some Developers Say Yes, Netlify Says No https://thenewstack.io/is-jamstack-toast-some-developers-say-yes-netlify-says-no/ Wed, 09 Aug 2023 15:49:14 +0000 https://thenewstack.io/?p=22715327

When Netlify acquired one of its former competitors, Gatsby, in February, I noted that its use of the term “Jamstack”

The post Is Jamstack Toast? Some Developers Say Yes, Netlify Says No appeared first on The New Stack.

]]>

When Netlify acquired one of its former competitors, Gatsby, in February, I noted that its use of the term “Jamstack” (which it coined in 2016) wasn’t so prominent in its marketing anymore. “Composable architectures” appeared to be the new catchphrase. Fast-forward six more months, and Netlify has just closed The Jamstack Community Discord, according to Jamstack aficionado Brian Rinaldi (who runs an email newsletter called — for now — JAMstacked).

Rinaldi added that Netlify has “largely abandoned” the term, “in favor of a “composable web” term that better aligns with their ambitions around becoming a broader enterprise platform including content (with tools like Netlify Connect).” Another developer who has heavily used Jamstack over the past 7-8 years, Jared White, considers the name “all but dead” now.

So is Jamstack dead or not? To find out from the horse’s mouth, I messaged Netlify CEO Matt Biilmann.

“Very much not dropping the term or declaring the architecture gone!” he wrote back, adding that “the Jamstack architecture has won out to a degree where there’s little distinguishing ‘Modern Web Architecture’ from Jamstack architecture.”

In a tweet, he clarified that “basically all modern web frameworks ended up being built around self standing front-ends talking to API’s and services.”

Paul Scanlon, a developer who works for CockroachDB (and is also a tutorial writer for The New Stack), agrees with Biilmann.

“Jamstack, in terms of the word or definition, might be “dead”, but the principle lives on,” he told me. “Web development prior to Jamstack very much existed with front end and backend being separate things, with developers working on either side of the stack. Jamstack not only merged the technologies to form a collapsed stack, but it meant developers naturally became full stack.”

Whether or not the term “Jamstack” is still relevant, Biilmann admits that the company is re-focusing its marketing efforts.

“So the architecture is more alive than ever and has won out to the degree that for us as a company, we are now more focused on marketing around how to help large enterprises at scale modernizing their web infrastructure, rather than convincing individual web teams to adopt a Jamstack approach,” he said.

The Rise and Plateau of Jamstack

Regardless of whether Jamstack “won,” it’s clear its popularity has plateaued. But why? To answer that, we first have to go back a few years.

I first wrote about Jamstack in July 2020, soon after I joined The New Stack. I interviewed Biilmann about a trend that was at the time styled “JAMstack” — the “JAM” referred to JavaScript, APIs and Markup; the “stack” part referred to cloud computing technologies.

I quickly learned that the acronym itself wasn’t particularly meaningful. It’s not so much the components of JAMstack that make it interesting, I wrote in 2020, “It’s that the approach decouples the frontend of web development from its backend.”

The early promise of JAMstack for developers was that it would make their lives easier, by allowing them to create simple HTML files using a “static-site generator” (like Gatsby or Hugo), call APIs using client-side JavaScript, and deploy using git (typically to CDNs — content delivery networks).

Netlify didn’t do all of this itself (especially the static file part), which is why it wanted to create an ecosystem called JAMstack. But it had a significant footprint in that ecosystem, by enabling developers to access APIs and deploy those static files. As Biilmann himself told me in 2020, “We [Netlify] take all of the complexity of building the deployment pipelines, of running the infrastructure, of managing serverless functions, of all of that, [and] we simply abstract that away from you.”

However, as the years rolled by, the Jamstack ecosystem seemed to increase in complexity — largely due to the ever-increasing popularity of React and its attendant frameworks. As Jared White explained in his post, “JAMstack eventually gave rise to a rebranded “Jamstack” with the major value prop being something rather entirely different: you could now build entire websites out of JavaScript libraries (aka React, or maybe Vue or Angular or Svelte) and JavaScript frameworks (aka Next.js, Gatsby, Nuxt, SvelteKit, etc.).”

So Is Jamstack Dead or Alive?

It’s fair to say that the term “Jamstack” (as it’s now styled) has become rather muddled. As Brian Rinaldi pointed out in his post, “the definition has continued to shift to accommodate new tools, new services and new platform features.” At the beginning of this year, Rinaldi wrote that “Jamstack has become more of a “community” than a set of architectural rules.”

Certainly, Netlify itself isn’t pushing the term as much as it used to. Jamstack now only barely features on Netlify’s homepage, way down the bottom in the form of two legacy menu items (“Jamstack Book” and “Jamstack Fund”). The word “composable,” by contrast, features twice at the very top of the page — including in its new catchphrase, “The future is composable.”

“Composable is a broader term that becomes more relevant when we’re talking to architects at large companies that are not just thinking about the web layer, but how to organize the underlying architecture as well,” Biilmann said when I asked him about the new term.

That’s fair enough, but what do practicing web developers think of Jamstack now? Jared White, for one, is ready to move on. “What Netlify gave us originally was a vision of how to deploy HTML-first websites easily via git commits and pushes, just like Heroku had done for dynamic applications,” he concluded. “All we need now is a modern Netlify/Heroku mashup that’s cheap, stable, and doesn’t need to reinvent the damn wheel every year.”

Paul Scanlon thinks the guiding principles of Jamstack are still relevant, although he sees little use for the term itself. “Does it even matter? I’m a Flash Developer, Flash died a long, long time ago and I’m still here. The guiding principles behind anything that move us forward will always remain. The buzzwords likely won’t.”

For his part, Rinaldi says that “the term seems to be dead but the tools and technologies it encompassed are still very much alive.” He plans to re-brand his JAMstacked newsletter but hasn’t yet decided on a replacement name.

The post Is Jamstack Toast? Some Developers Say Yes, Netlify Says No appeared first on The New Stack.

]]>
5 Things to Know Before Adopting Cloud Native https://thenewstack.io/5-things-to-know-before-adopting-cloud-native/ Tue, 08 Aug 2023 15:13:30 +0000 https://thenewstack.io/?p=22715104

This is the first of a four-part series. Cloud native adoption isn’t something that can be done with a lift-and-shift

The post 5 Things to Know Before Adopting Cloud Native appeared first on The New Stack.

]]>

This is the first of a four-part series.

Cloud native adoption isn’t something that can be done with a lift-and-shift migration. There’s much to learn and consider before taking the leap to ensure the cloud native environment can help with business and technical needs. For those who are early in their modernization journeys, this can mean learning the various cloud native terms, benefits, pitfalls and about how cloud native observability is essential to success. 

To help, we’ve created a four-part primer around “getting started with cloud native.” These articles are designed to educate and help outline the what and why of cloud native architecture.

This first article covers the basic elements of cloud native, its differences from legacy architectures and its connection to the DevOps methodology. 

A Look at Cloud Native and Its Necessity for Business Today

A reliable cloud native environment is essential for the survival of enterprises today. Moving to a modern microservices and container-based architecture promises speed, efficiency, availability and the ability to innovate faster — key advantages enterprises need to compete in a world where a new generation of born-in-the-cloud companies are luring away customers hungry for new features, fast transactions and always-on service.

Add in economic uncertainty and the competitive stakes for enterprises soar: A simple search delay on an online retailer’s site could lose a loyal customer and coveted revenue to a more innovative and reliable competitor.

With encroaching competition from nimble organizations, an uncertain global economy and savvy, demanding customers, it’s more important than ever to transition to a modern, cloud native technology stack and best practices that can deliver:

  • A highly available and more reliable service. Cloud native best practices enable you to build a more resilient product and service.
  • More flexibility and interoperability. Cloud native environments are not only more portable, but they also provide the ability to scale up and down dynamically and on demand.
  • Speed and more efficiency. Engineers can iterate faster to handle increased customer expectations.

But buyer beware, cloud native is challenging. The benefits of adopting cloud native technologies are impossible to ignore and Gartner predicts that 90% of companies will be cloud native by 2027. But there are also challenges that come with the shift from a traditional to a modern environment: If the transition to cloud native lacks proper planning and tools, enterprises risk unprecedented data volume, increased costs, downtime, reduced engineering productivity, and, yes, customer dissatisfaction.

What Is Cloud Native?

The challenge most organizations face is how to have the flexibility to rapidly develop and deploy new applications to meet fast-changing business requirements. Increasingly, cloud native is the architecture of choice to build and deploy new applications. A cloud native approach offers benefits to both the business and developers.

In contrast to monolithic application development, cloud native applications or services are loosely coupled with explicitly described dependencies. As a result:

  • Applications and processes run in software containers as isolated units.
  • Independent services and resources are managed by central orchestration processes to improve resource usage and reduce maintenance costs.
  • Businesses get a highly dynamic system that is composed of independent processes that work together to provide business value.

Fundamentally, a cloud native architecture makes use of microservices and containers that leverage public or private cloud platforms as the preferred deployment infrastructure.

  • Microservices provide the loosely coupled application architecture, which enables deployment in highly distributed patterns. Additionally, microservices support a growing ecosystem of solutions that can complement or extend a cloud platform.
  • Containers are important because developing, deploying and maintaining applications requires a lot of ongoing work. Containers offer a way for processes and applications to be bundled and run. They are portable and easy to scale. They can be used throughout an application’s life cycle, from development to test to production. They also allow large applications to be broken into smaller components and presented to other applications as microservices.
  • Kubernetes (also called K8s) is the most popular open source platform used to orchestrate containers. Once engineers declare their desired infrastructure state, Kubernetes uses automation to continuously reconcile the cluster’s actual state with that declaration (see the sketch after this list). Organizations can run Kubernetes with containers on bare metal, virtual machines, public cloud, private cloud and hybrid cloud.
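
A minimal sketch of that reconciliation loop, with the deployment name and image as placeholders:

# Declare a desired state: three replicas of an example container image.
kubectl create deployment web --image=nginx:1.25 --replicas=3

# Kubernetes continuously reconciles actual state with desired state,
# so a deleted pod is automatically replaced to keep three replicas running.
kubectl get pods -l app=web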

The Cloud Native and DevOps Connection

Cloud native is the intersection of two kinds of changes. One is a software and technical architecture around microservices and containers, and the other is an organizational change known as DevOps. DevOps is a practice that breaks down the silos between development teams and central IT operations teams where the engineers who write the software are also responsible for operating it. This is critical in a cloud native era, as distributed systems are so complex the operations must be run by the teams who built them.

With cloud native and DevOps, small teams work on discrete projects, which can easily be rolled up into the composite app. They can work faster without all of the hassles of operating as part of a larger team. Amazon Executive Chairman Jeff Bezos felt that this small team approach was such a benefit he popularized the concept of the two-pizza team, which is the number of people that can be fed by two pizzas. As the theory goes, the smaller the team, the better the collaboration between members. And such collaboration is critical because software releases are done at a much faster pace than ever before.

Together, cloud native and DevOps allow organizations to rapidly create and frequently update applications to meet ever-changing business opportunities. They help cater to stakeholders and a user base that expects (and demands) apps to be highly available, responsive and quick to incorporate the newest technologies as they emerge.

The Monolithic Architecture Had Its Time and Place

We just discussed how a microservices architecture is a structured way to deploy a collection of distributed yet interdependent services in an organization. It is game-changing compared to some past application development methodologies, allowing development teams to work independently and at cloud native scale.

In comparison, with a monolithic architecture, all elements of an application are tightly integrated. A simple change to one, say, the need to support a new frontend, requires making that change and then recompiling the entire application. There are typically three advantages to this architecture:

  • Simple to develop: Many development tools support monolithic application creation.
  • Simple to deploy: Deploy a single file or directory to your runtime.
  • Simple to scale: Scaling the application is easily done by running multiple copies behind some sort of load balancer.

The Monolithic Model

The monolithic model is more traditional and certainly has some pros, but it will slow down enterprises needing to scale and compete in a world where the name of the game is fast, reliable, innovative application development. Here are some of the main issues organizations have when using a monolithic model:

  • Scalability – Individual components aren’t easily scalable.
  • Flexibility – A monolith is constrained by the technologies already used in the system and is often not portable to new environments (across clouds).
  • Reliability – Module errors can affect an application’s availability.
  • Deployment – The entire monolith needs to be redeployed when there are changes.
  • Development speed – Development is more complex and slower when a large, monolithic application is involved.

A Final Word about Cloud Native

If the last few years have taught us anything, it’s that speed and agility are the foundation of success for digitally transformed organizations. Organizations that can meet the rapidly evolving demands of their lines of business, customers and internal users will be able to successfully navigate tough times.

Using a cloud native architecture helps ensure new applications can be created quickly and existing applications can be promptly updated to incorporate new technologies or as requirements change over time.

In the next installment, we’ll be discussing the benefits of cloud native architecture and how it can empower modern business.

The post 5 Things to Know Before Adopting Cloud Native appeared first on The New Stack.

]]>
Install Cloud Foundry Korifi on Google Kubernetes Engine https://thenewstack.io/install-cloud-foundry-korifi-on-google-kubernetes-engine/ Tue, 01 Aug 2023 17:00:47 +0000 https://thenewstack.io/?p=22713213

Managed Kubernetes clusters are very popular among software developers. They look to managed providers for a variety of reasons, the

The post Install Cloud Foundry Korifi on Google Kubernetes Engine appeared first on The New Stack.

]]>

Managed Kubernetes clusters are very popular among software developers.

They look to managed providers for a variety of reasons, the chief ones being simplified management, improved reliability and cost control.

Of the many providers available, Google Cloud is rather popular among software developers and is a good choice as an infrastructure provider.

Google Kubernetes Engine (GKE) is the managed Kubernetes service provided by Google Cloud, but developers need more than GKE alone to manage their apps.

Cloud Foundry is a powerful Platform-as-a-Service (PaaS) that can help software developers build, deploy, and manage applications more easily and securely. It installs on any cloud-based infrastructure and transforms it into a multitenant, self-service, and consumable resource. It is built with the goal of helping developers focus on building applications — and not managing infrastructure.

Cloud Foundry was originally built for use on VM-based compute. For over a decade, Cloud Foundry has been successfully implemented for planet-scale workloads. Now, this same abstraction is being made available for Kubernetes-based workloads, too. Cloud Foundry Korifi is an implementation of the Cloud Foundry API built on top of Kubernetes-native custom resources.

Korifi is designed to install on any infrastructure provider. This is greatly simplified by the managed Kubernetes offerings that are available, and Korifi has additionally been tested on kind, k3s and other Kubernetes flavors for local development. Apps written in any language or framework can be deployed using Korifi.

This tutorial will show you how to install Cloud Foundry Korifi on Google Kubernetes Engine.

Prerequisites

Please install the following tools to start.

Installation Steps

The first step is to create a Kubernetes cluster. When using Google Kubernetes Engine, we found that creating a cluster using “Autopilot” does not work because it conflicts with some of the networking configuration required for Korifi. Please choose the “Standard” mode to configure the cluster. You can create the cluster from the command line using a command like the following:
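
A representative example follows; the cluster name, zone, machine type and node count are placeholders to adjust for your environment:

# Illustrative only: create a Standard-mode GKE cluster for the Korifi installation.
gcloud container clusters create korifi-demo \
  --zone us-central1-c \
  --machine-type e2-standard-4 \
  --num-nodes 3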

The command uses the Google Cloud SDK (gcloud) tool to create a Google Kubernetes Engine (GKE) cluster with a specific configuration. Next, we will install the following dependencies: cert-manager, kpack and Contour.

Cert-Manager is an open source certificate management solution designed specifically for Kubernetes clusters. It can be installed with a single kubectl apply command, with the latest release referenced in the path to the YAML definition.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.0/cert-manager.yaml


Kpack is an open source project that integrates with Kubernetes to provide a container-native build process. It consumes Cloud Native Buildpacks to export OCI-compatible containers. Kpack can be installed by using the kubectl apply command, passing the YAML containing the declaration of the latest release.

kubectl apply -f https://github.com/pivotal/kpack/releases/download/v0.11.0/release-0.11.0.yaml


Contour is an open source ingress controller for Kubernetes that is built on top of the Envoy proxy. An ingress controller is a Kubernetes resource that manages the inbound network traffic to services within a cluster. It acts as a gateway and provides external access to the services running inside the cluster. Contour specifically focuses on providing advanced features and capabilities for managing ingress in Kubernetes.

kubectl apply -f https://projectcontour.io/quickstart/contour.yaml


Once Contour is installed on the Kubernetes cluster, it will provision an external-facing IP address that allows access to the cluster. The Contour service will need to be queried to ascertain the IP address we are going to map for ingress into the cluster. The following command will help with that:

kubectl get service envoy -n projectcontour -ojsonpath='{.status.loadBalancer.ingress[0]}'


The output from this command will be an IP address, e.g. {“ip”: “34.31.52.175”}, which will be used at various places as the base domain, suffixed with nip.io.

The installation requires a container registry to function. For this installation, we will be using Google Artifact Registry. In order to access this container registry, a secret will have to be created and configured. The command for creating the registry credentials is as follows:

kubectl --namespace "cf" create secret docker-registry image-registry-credentials \
  --docker-username="<docker-hub-username>" \
  --docker-password="<docker-hub-access-token>"


For this installation, the following values will have to be used:

kubectl --namespace "cf" create secret docker-registry image-registry-credentials --docker-server="us-central1-docker.pkg.dev" --docker-username="_json_key" --docker-password "$(awk -v RS= '{$1=$1}1' ~/Downloads/summit-labs-8ff7123608fe.json)"


Once the secret has been created, use the following Helm chart to install Korifi on the GKE cluster.

helm install korifi https://github.com/cloudfoundry/korifi/releases/download/v0.7.1/korifi-0.7.1.tgz \
  --namespace="korifi" \
  --set=global.generateIngressCertificates=true \
  --set=global.rootNamespace="cf" \
  --set=global.containerRegistrySecret="image-registry-credentials" \
  --set=adminUserName="riyengar@cloudfoundry.org" \
  --set=api.apiServer.url="api.34.31.52.175.nip.io" \
  --set=global.defaultAppDomainName="apps.34.31.52.175.nip.io" \
  --set=global.containerRepositoryPrefix="us-central1-docker.pkg.dev/summit-labs/korifi/korifi-" \
  --set=kpackImageBuilder.builderRepository="us-central1-docker.pkg.dev/summit-labs/korifi/kpack-builder" \
  --wait


Note: We use nip.io as the suffix for the externally available IP address that can reach the cluster. Nip.io is a wildcard DNS provider.

Once installation is completed, use the cf cli to set the API endpoint and log in to the cluster.

cf api https://api.34.31.52.175.nip.io --skip-ssl-validation

cf login


The following commands can be used to test the installation.

cf create-org active
cf target -o active
cf create-space -o active ant
cf target -o active -s ant
cf push mighty-monkey -p ~/sandbox/korifi/tests/smoke/assets/test-node-app/

Where to Begin?

Cloud Foundry Korifi is available as a fully open source project. If you’re looking for a way to:

  • simplify the deployment experience for your application developers, and
  • attain operational excellence when using Kubernetes clusters,

then Korifi is a great tool to have in your arsenal. In addition to this tutorial, you can work through the basic Korifi tutorials on the official Cloud Foundry page to learn the best way to get started.

The post Install Cloud Foundry Korifi on Google Kubernetes Engine appeared first on The New Stack.

]]>
Need for Speed: Cloud Power Moves Expand AI Supercomputing https://thenewstack.io/need-for-speed-cloud-power-moves-expand-ai-supercomputing/ Tue, 01 Aug 2023 10:00:22 +0000 https://thenewstack.io/?p=22714536

Even Elon Musk and his companies’ billions of procurement dollars can not acquire Nvidia’s latest GPUs for machine learning fast

The post Need for Speed: Cloud Power Moves Expand AI Supercomputing appeared first on The New Stack.

]]>

Even Elon Musk and his companies’ billions of procurement dollars cannot acquire Nvidia’s latest GPUs for machine learning fast enough. That is how strongly demand for GPUs is outstripping supply.

However, there are some clues on where these rare AI training GPUs, such as the H100 and A100, are showing up — in new cloud-based AI supercomputers that were announced in quick succession within the last month.

Cloud providers are picking up investments in data-center equipment after a relative lull in upgrades during the first half of this year. Hyperscalers are rewiring systems in new ways to meet the processing and power demands of AI applications.

GPUs are a cornerstone of machine learning and are interconnected with high-bandwidth memory, networking, and high-capacity storage. Newer data centers also use data processing units to transport data at an unprecedented pace between systems.

Customers will have to fork out $37,000 to access Nvidia’s latest GPUs in its DGX Cloud, which became generally available in July. That is just the starting price, per instance per month, and it goes up for more memory, storage, and faster GPUs. There are cheaper options — users can access virtual machines with Nvidia’s A100 GPU on Microsoft’s Azure cloud for close to half the price.

Nvidia’s DGX Cloud is deployed in its own data centers in the U.S. and U.K., and in Oracle’s cloud infrastructure. Each node of DGX Cloud has eight H100 or A100 GPUs, 640GB of GPU memory per node, and Nvidia’s proprietary NVLink interconnect.

Nvidia CEO Jensen Huang likens DGX Cloud to an “AI factory” — an assembly line of data that is churned inside a proprietary black box, with the output being usable data. Customers need to only worry about the results and not the hardware and software.

But DGX Cloud already has some competitors on the horizon.

Enter Cerebras

Andrew Feldman, the CEO of AI chip maker Cerebras Systems, is not an Nvidia fan. His company makes the world’s largest chip, called the WSE-2, which can train large-language models with billions of parameters on a single chip; such models typically require multiple GPUs to train.

Cerebras has been in business since 2015, but finally found a commercial adopter of its AI systems in G42, a Middle Eastern cloud provider, which will deploy the hardware in three U.S. data centers by the end of this year. That is a breakthrough for Cerebras, especially with other AI chip makers struggling to even get companies to sample their products.

Cerebras’ CG-1, CG-2, and CG-3 systems will be hooked up to form a large AI supercomputer. Each system will deliver four exaflops of performance. The number of systems deployed will expand to nine by the end of the year, delivering a total of 36 exaflops of performance.

“We support up to 600 billion parameters extensible to 100 trillion, fed by 36,000 AMD Epyc cores,” Feldman said. The Cerebras chips are not GPUs, but specialized AI chips with 850,000 cores, 40GB of on-chip SRAM, and 20PBps of on-chip throughput.

Cerebras’s hardware offers some AI chip variety outside of Nvidia’s GPUs, which currently dominate the market. Cerebras’s AI supercomputers until now were largely experiments at U.S. government labs, but are now set for wider commercial use. Companies including GlaxoSmithKline and TotalEnergies have put the systems through stress tests.

Programming for Nvidia’s GPUs can be complicated and can require thousands of lines of code to fully exploit the processor’s computing power. Feldman said it takes just a few lines of Python code to get training going on Cerebras chips.

“It’s talked about as three-dimensional parallelism. That is where you have to do tensor model parallel and pipeline model parallel. That is where you have all these complicated tricks to break up work and spread it over a large number of GPUs. That is the additional 27,000 lines of code. And that is exactly the code that we don’t need,” Feldman said.

Mojo Dojo for Tesla

Tesla recently announced it had started production of its Dojo supercomputer, which will train on video data to ultimately allow the company to deploy autonomous driving systems to its cars. Dojo uses Tesla’s self-developed D1 chip, which has 22.6 teraflops of performance.

During the earnings call, Musk said the company is spending close to $1 billion through the end of 2024 on its Dojo supercomputer.

“We think we may reach in-house neural net training capability of 100 exaflops by the end of next year” with GPUs and Dojo, Musk said.

Musk is a big fan of GPUs for video training to support AI in Tesla’s electric vehicles. The company collects visual data from cameras in its cars, which it then uses to build an AI system that will ultimately be deployed to those cars to improve driver safety.

But the GPU shortage is slowing down machine-learning tasks at the company. The company plans to deploy 300,000 Nvidia A100 GPUs by the end of next year.

“We’re using a lot of Nvidia hardware. We’ll continue to — we’ll actually take Nvidia hardware as fast as Nvidia will deliver it to us. Tremendous respect for [CEO Jensen Huang] and Nvidia. They’ve done an incredible job,” Musk said in an earnings call this month.

“Frankly, I don’t know if they could deliver us enough GPUs,” he added.

In late July, Amazon Web Services launched the EC2 P5 instances, which will bring Nvidia’s latest H100 GPUs to the cloud service.

P5 instances are the fastest VMs in AWS’s portfolio, company representatives said at the AWS Summit in New York. P5 will be six times faster than its predecessor, P4, and reduce training costs by up to 40%, they said.

The P5 instances will be interlinked into UltraScale clusters, and up to 20,000 GPUs can be interconnected to create a mammoth AI training cluster.

“This enables us to deliver 20 exaflops of aggregate compute capability,” said Swami Sivasubramanian, vice president of database, analytics, and machine learning at AWS, during a keynote at the summit.

Typically, training tasks are broken into smaller parts, which are then shared among GPUs in a large cluster. Processing and response times need to be synchronized carefully to ensure timely coordination of outputs.

The EC2 P5 instances use a 3,200Gbps interconnect to synchronize weights for quicker training. The faster GPUs and higher throughput ensure companies can work with larger models or cut infrastructure costs on same-sized models.

AWS is taking a different approach to AI than rivals Google Cloud and Microsoft, which are trying to lure companies to use their large-language models, called PaLM 2 and GPT-4, respectively. Software companies are paying to link up to OpenAI’s GPT models via its API.

Amazon also wants AWS to be a storefront that dishes out a wide variety of the latest AI models, with its cloud service providing the computing horsepower. The company announced access to Anthropic’s Claude 2 and Stability AI’s Stable Diffusion XL 1.0, which compete with transformer models from Google and Microsoft.

“Models are just one part of the equation, you need to have the right infrastructure. You need to provide the right workflow support… the right enterprise security for every little piece of the workflow. That is where we’ll focus on … going forward,” said Vasi Philomin, vice president and general manager for generative AI at Amazon.

The post Need for Speed: Cloud Power Moves Expand AI Supercomputing appeared first on The New Stack.

]]>
The Future of the Enterprise Cloud Is Multi-Architecture Infrastructure https://thenewstack.io/the-future-of-the-enterprise-cloud-is-multi-architecture-infrastructure/ Fri, 28 Jul 2023 17:00:16 +0000 https://thenewstack.io/?p=22714087

After years of steady declines, the costs of running a cloud data center are now soaring, due to various factors

The post The Future of the Enterprise Cloud Is Multi-Architecture Infrastructure appeared first on The New Stack.

]]>

After years of steady declines, the costs of running a cloud data center are now soaring, due to various factors such as aging infrastructure, rising energy costs and supply chain issues.

A survey by the Uptime Institute showed that enterprise data-center owners are most concerned about rising energy costs, followed by IT hardware costs. In a recent example, Google announced increases in some of its cloud storage and network egress prices, some of which had previously been free to users.

In short, “cloud-flation” is a new reality, and developers are paying a price by having to do more with less. Organizations need to continuously understand and measure the impact of their compute to balance performance, efficiency, and design flexibility in line with budget and business goals.

What are some steps developers can take to reduce costs?

There are a few pieces of low-hanging fruit, including:

  • Optimize the allocation of cloud resources by analyzing usage patterns and adjusting the size of instances, storage, and databases to match the workload’s requirements.
  • Implement auto-scaling mechanisms that dynamically adjust the number of instances based on demand. This ensures resources are provisioned as needed, preventing over-provisioning during low-traffic periods and reducing costs.
  • Evaluate your data storage and database needs and choose the most cost-effective options. Use tiered storage options to move infrequently accessed data to lower-cost storage tiers.

Many companies have adopted cost monitoring and reporting, creating alerts to notify their teams of sudden spikes or anomalies. One significant movement in this direction is around FinOps, a concatenation of “finance” and “DevOps.” This emerging cloud financial management discipline enables organizations to help teams manage their cloud costs. Another example is Amazon Web Services, with its cost-optimization guide pointing to Arm-based Graviton as a way to improve price performance.

Embrace the New Paradigm  

Perhaps the most important piece of fruit to be harvested, however, is to think about computing differently. And this is easier said than done.

Data centers and the cloud grew up on a single, monolithic approach to computing — one size fits all. That worked in the days when workloads were relatively few and very straightforward. But as cloud adoption exploded, so too have the number and types of workloads that users require. A one-size-fits-all environment just isn’t flexible enough for users to be able to run the types of workloads they want in the most effective and cost-efficient manner.

Today, technologies have emerged to overthrow the old paradigm and give developers and cloud providers what they need: flexibility and choice. One manifestation is multi-architecture: the ability of a cloud platform or service to support more than a single legacy architecture and offer developers the flexibility to choose.

The Liberty of Flexibility and Choice

Flexibility to run workloads on the architecture of your choice is important for organizations for two reasons: better price performance and — a reason that is far downstream from data centers but nevertheless important — laptops and mobile devices.

Price performance often arises when organizations realize that running workloads, like web servers or databases, on Arm could be cost-effective, either for themselves or in response to customer demand. Remember those statistics I shared earlier? They’re a big motivation in this context. And this is why the flexibility to choose the right compute for the right workload is critical.

Arm has a legacy of delivering cost-effective, power-efficient computing solutions for mobile technologies for over 30 years. During those years, other sectors, including infrastructure, have embraced these benefits. Today every major public cloud provider runs Arm in some form for various workloads. For example, 48 of the top 50 AWS customers run on Arm Neoverse-based AWS Graviton. The cost benefits of deploying an Arm-based server compared to a traditional one are significant.

The second motivation relates to laptops, which are increasingly being run by power-efficient Arm processors. Developers using these machines began wanting to develop on Arm all the way from their laptop into the cloud. They’re embracing Arm64 for their production and development environments because it makes it easier to troubleshoot and reproduce bugs locally earlier in the development process. And with Arm-based processors now available in every major cloud, Arm-native developers need a multi-arch aware toolchain to safely deploy their code.
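
As one example of what a multi-arch aware toolchain looks like in practice, Docker’s buildx can produce a single image manifest that covers both architectures. This is a hedged sketch, assuming Docker with the buildx plugin is available and that you have push access to the placeholder registry name shown:

# create and select a builder that supports multi-platform builds
docker buildx create --use
# build and push one manifest covering x86 and Arm
docker buildx build --platform linux/amd64,linux/arm64 -t registry.example.com/myapp:1.0 --push .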

Doing the Work

We see three main steps to adopting a multi-arch infrastructure: inform, optimize and operate:

  • Inform involves taking an inventory of your entire software stack, including finding the operating systems, images, libraries, frameworks, deployment and testing tools, monitoring solutions, security measures and other components you rely on. Make a comprehensive list and check each item for Arm support. Additionally, identify the most resource-intensive components in terms of compute, as these will be your hotspots for optimization.
  • Optimize allows you to provision a test Arm environment easily. You can spin it up on a public cloud and proceed to make necessary upgrades, changes and conditional statements to ensure compatibility with different architectures. It is crucial to determine the key metrics you care about and conduct performance testing accordingly. Simultaneously, consider upgrading your CI/CD processes to accommodate more than one architecture. Keep in mind that this stage may require many iterations and that you can start migrating workloads before completing all infrastructure upgrades.
  • Operate within your chosen environment. For instance, in Kubernetes, decide how you will build your cluster. Consider whether to prioritize migrating control nodes or worker nodes first, or opting for a mixture of both. This decision will depend on your software stack, availability and initial workload choices. Modify your cluster creation scripts accordingly.

With the infrastructure in place, you can proceed to deploy your workloads.
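
For a mixed cluster, a quick way to see which architectures your nodes offer, and to steer a single workload onto Arm nodes while you validate it, is sketched below. The deployment name is a placeholder, and kubernetes.io/arch is the standard node label:

# list nodes with their CPU architecture
kubectl get nodes -L kubernetes.io/arch
# pin one deployment to arm64 nodes during validation
kubectl patch deployment myapp -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/arch":"arm64"}}}}}'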

Conclusion

A few years ago, migrating to Arm meant hunting for specific versions of software and libraries that support Arm. This support was limited and uneven. But over time, the ecosystem matured quickly, making the migration easier and ensuring software on Arm “just works.” Some notable vendors, such as Redis, MongoDB and Nginx, have come out with Arm support in recent years. This improved ecosystem support contributes to the ease of migration.

The high-performance, energy-efficient benefits of transitioning away from running only on legacy architectures are such that big companies like Airbnb are undergoing this migration, even though they know it’s a multiyear journey. Airbnb has a complex infrastructure that assumes everything is x86, so it needs to adjust and test all of its systems to ensure compatibility with both x86 and Arm architectures. In Airbnb’s case, the long-term benefits are worth the up-front costs and time investment.

Melanie Cebula, a software engineer at Airbnb, put it this way at a 2022 KubeCon-CloudNativeCon presentation: “So Arm64 is in the cloud. It’s in all the clouds, and it’s cheaper, and that’s why we’re doing the work.”

Further solidifying these points, Forrester’s Total Economic Impact study, including interviews and survey data, revealed a cloud infrastructure cost savings of up to 80%, with 30% to 60% lower upfront infrastructure costs.

The key to accelerating and unlocking even more innovation in the world today is for workloads to run on the best hardware for the user’s price-performance needs. Arm technology continues to push the boundaries of the performance-efficiency balance without the developers having to worry about whether their software is compatible.

The post The Future of the Enterprise Cloud Is Multi-Architecture Infrastructure appeared first on The New Stack.

]]>
How Platform Engineering Is Disrupting FinOps https://thenewstack.io/how-platform-engineering-is-disrupting-finops/ Fri, 28 Jul 2023 14:20:54 +0000 https://thenewstack.io/?p=22714247

The FinOps movement evolved from a key selling point of the public cloud — decentralization. A generation of technology and

The post How Platform Engineering Is Disrupting FinOps appeared first on The New Stack.

]]>

The FinOps movement evolved from a key selling point of the public cloud — decentralization.

A generation of technology and engineering leaders lost sleep over the fact that anyone in their organization could procure technology resources against their budget with absolutely no oversight.

This stress created a groundswell that has grown into an industry of its own. Built on a wealth of data from cloud-service providers’ bills, the complex world of pricing models and a need to keep budget under control, the FinOps movement played a pivotal role in the astronomical growth of the public cloud. Engineering leaders could rely on their FinOps tools to dive full-force into the public cloud, giving their teams unprecedented access to technology in the process.

But while FinOps tools continued to advance on the pricing and financial side of the cloud, progress on the operational side has remained elusive.

Billing data is valuable to help determine the right pricing models and understand what your team has consumed, but it does little to prevent the activity that drives up cloud waste in the first place. When it comes down to it, the insights taken from cloud-billing data can only show your teams what they ought not to do. Those hoping to prevent that activity have little recourse other than to hope that it doesn’t happen.

Here are a few ways the platform engineering movement is helping to bridge that gap.

Tracking Real-Time Cloud Costs by Team

A platform approach leverages decentralization to help development teams get more value from the public cloud. Anyone with access to the platform can deploy the testing, staging and production environments they need on demand.

However, by orchestrating and deploying those environments via the platform, this approach allows engineering teams to calculate costs based on the configuration of the cloud services and the duration of the deployments.

The platform approach provides data reflecting activity as it occurs. By tracking real-time deployments, you can understand costs as they accrue and make operational adjustments before receiving the bill.

Prohibiting Oversized or Otherwise Unapproved Cloud Instances

A platform approach to delivering application environments also provides a way to make governance policies part of day-to-day operations.

For years, FinOps teams have struggled to enforce standards for cost-efficient cloud behavior. With configurations decentralized across git repositories and Infrastructure-as-Code tools, FinOps teams have had little way of knowing whether cloud deployments adhered to best practices until they received the cloud bill. And even then, ensuring compliance going forward is an uphill battle.

A platform that orchestrates environments from configurations defined in git gives the FinOps team a mechanism to make up ground in that battle.

Let’s take rightsizing, for example. The practice of identifying and rightsizing oversized cloud instances can bring down costs without disrupting operations.

Over time, however, these changes are subject to drift. As oversized instances creep back in, the FinOps team likely won’t know until the cloud bill arrives, at which point they have the same awkward conversations.

The platform, however, can make rightsizing a requirement to deploy an environment. By setting a policy to prohibit specific instance sizes, the platform can deny any deployment with an oversized cloud instance.

These types of policies can vary in purpose and scope, applying rules to technologies or runtimes and limiting enforcement to individual teams. But without the platform as the focal point for enforcement, governance policies are little more than hopeful requests.
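
How enforcement is wired up depends on the platform, but even a simple detection pass illustrates the idea. The sketch below is an assumption-laden example using the AWS CLI: it treats anything in a 4xlarge or 8xlarge size as oversized and lists running instances that break that rule, which a platform policy would instead reject at deploy time:

aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[?contains(InstanceType, `4xlarge`) || contains(InstanceType, `8xlarge`)].[InstanceId,InstanceType]' \
  --output text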

Scheduling Automated Shutdown

Terminating idle resources is another benefit of a platform approach.

Consider, for example, a software-testing team that only works standard business hours, Monday through Friday. Any testing environment left running overnight or over a weekend will incur unnecessary cloud costs, and FinOps data does not give you the tools to shut them down or prevent them from running unnecessarily in the first place.

Since a platform deploys the environments, you can set rules to deploy and terminate the VMs to support the testing team’s workday. Setting a schedule to run VMs from 8 a.m. to 8 p.m. every weekday ensures that testing environments will run when required and shut down when they’re no longer needed.
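
Where the platform itself doesn’t offer scheduling, the same effect can be approximated with tags and two cron entries. This is a rough sketch, assuming the AWS CLI is configured and that testing VMs carry a hypothetical env=testing tag; a real platform would handle this natively:

# crontab entries: stop tagged testing instances at 8 p.m. on weekdays, start them again at 8 a.m.
0 20 * * 1-5 aws ec2 stop-instances --instance-ids $(aws ec2 describe-instances --filters "Name=tag:env,Values=testing" "Name=instance-state-name,Values=running" --query 'Reservations[].Instances[].InstanceId' --output text)
0 8 * * 1-5 aws ec2 start-instances --instance-ids $(aws ec2 describe-instances --filters "Name=tag:env,Values=testing" "Name=instance-state-name,Values=stopped" --query 'Reservations[].Instances[].InstanceId' --output text)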

Automating Consistent Tagging

Even the most well-thought-out tagging strategy is only as strong as the person who applies the tags.

Missing tags, typos and inconsistent capitalization can lead to blind spots in reporting that hold back cost-optimization efforts. How do you rightsize cloud instances if you don’t know who is deploying them?

Again, the platform can answer this problem. The self-service nature of deployment via a platform provides an opportunity to standardize tagging. Including a required tag field populated from a centrally managed picklist eliminates the risk of missing and misspelled tags.

When paired with visibility into real-time deployments, this provides a valuable way to monitor cloud costs with the context needed to intervene.
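
When the platform applies tags itself at deployment time, spelling and casing stop being a human problem. A minimal sketch, assuming the AWS CLI, a placeholder instance ID and tag values selected from the platform’s picklist rather than typed by hand:

TEAM="payments"   # chosen from the managed picklist, not free text
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=team,Value="$TEAM" Key=env,Value=staging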

FinOps is still a pillar of the modern cloud world. When paired with an environment orchestration platform, it will only be more valuable.

The post How Platform Engineering Is Disrupting FinOps appeared first on The New Stack.

]]>
The Cloud Is Under Attack. How Do You Secure It? https://thenewstack.io/the-cloud-is-under-attack-how-do-you-secure-it/ Fri, 28 Jul 2023 13:24:56 +0000 https://thenewstack.io/?p=22713696

What’s great about building for and deploying to the cloud? Scale. “One of the reasons why developers love the cloud

The post The Cloud Is Under Attack. How Do You Secure It? appeared first on The New Stack.

]]>

What’s great about building for and deploying to the cloud? Scale.

“One of the reasons why developers love the cloud so much is because A, it makes things quick and easy,” said Elia Zaitsev, Global CTO at CrowdStrike, in this episode of The New Stack Makers.

“And B, you can scale to crazy levels really quickly — as long as you’ve got a big enough credit card to keep adding on the infrastructure.”

But what also makes building for and deploying to the cloud so dangerous? Scale.

If a developer makes a mistake in building an application and deploys it to the cloud, Zaitsev noted, “Now I’ve just introduced that same mistake a million times over and there may not be any other teams, departments, processes, limiting factors getting in my way that stopped me from doing a bad thing that I didn’t realize.”

No wonder then that attacks focused on the cloud nearly tripled from 2021 to 2022, according to the latest Cloud Risk Report from CrowdStrike.

In this episode of Makers, Zaitsev spoke to TNS host Heather Joslyn about the growing problem of cloud-focused attacks, the challenges involved in protecting against those attacks and some best practices that can help.

The Rise of Cloud Native Attackers

A big challenge in securing the cloud is that it’s so new: Amazon only introduced the first public cloud in 2006, and many organizations, even those that have moved fully to the cloud, are still learning about it.

However, Zaitsev said, the cloud native generation — both within and outside of organizations — is coming of age: ”We’ve got this new generation of adversaries that are coming up, and they totally get the cloud. And they implicitly understand what those risks are.”

For these bad actors, the advantage in attacking a cloud is — you guessed it — scale.

“If there’s an issue in one place, it might be scaled up to many, many systems,” he said. “That’s actually the preferred attack vector, because they know if they find an issue that they’re comfortable exploiting, they can actually much more rapidly achieve their actions or objective, cause damage, make financial gains on their side, cause pain to the end user, etc.”

Cultural issues within an organization — tension between security professionals who are “paid to be paranoid,” as Zaitsev said, and developers who are incentivized to build quickly — can make it harder to protect the cloud.

Another big challenge is that attacks have gotten harder to detect as attackers grow more sophisticated. Stealing credentials is a common way attacks start. Said Zaitsev, “Typically, what they want to do, or the most effective way for them to penetrate an organization is to pretend to be you.”

Hard-Coded Credentials: an ‘Unforced Error’

What are some of the best practices to protect your clouds? Adopting the principle of least privilege should be a cornerstone of your strategy, our podcast guest said, and running regular evaluations of who has access and to what.

“Organizations need to be regularly doing this kind of ongoing hygiene and assessment, looking at all the accounts, the credentials that they’ve created in their environment,” Zaitsev said. “And doing that regular assessment of, is this really what they need? Can I dial it back a little bit and be a bit more secure?”
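
On AWS, one hedged way to run that kind of review is IAM’s service last accessed data, which reports which services a role or user has actually touched. The role ARN below is a placeholder:

# kick off the report for a given role, then fetch the results
JOB_ID=$(aws iam generate-service-last-accessed-details --arn arn:aws:iam::123456789012:role/app-role --query 'JobId' --output text)
aws iam get-service-last-accessed-details --job-id "$JOB_ID"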

He also reminded listeners to avoid hard-coding credentials into their systems. “There’s almost never a good reason to do it,” Zaitsev said. “Other than it being quicker and cheaper and lazier. It’s definitely an unforced error that you can avoid.”
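
One common alternative, sketched here on the assumption that the credential already lives in AWS Secrets Manager under a hypothetical name, is to fetch secrets at runtime rather than baking them into code or configuration:

# retrieve the secret value at startup instead of hard-coding it
DB_PASSWORD=$(aws secretsmanager get-secret-value --secret-id prod/db-password --query 'SecretString' --output text)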

Listen to the full episode for more on best practices to avoid cloud-focused attacks, including the importance of runtime security and avoiding misconfigurations.

The post The Cloud Is Under Attack. How Do You Secure It? appeared first on The New Stack.

]]>
Cloud Optimization: Practical Steps to Lower Your Bills https://thenewstack.io/cloud-optimization-practical-steps-to-lower-your-bills/ Thu, 27 Jul 2023 15:30:07 +0000 https://thenewstack.io/?p=22714158

Cloud computing offers organizations the ability to scale in ways that were impossible only a decade ago, and today thousands

The post Cloud Optimization: Practical Steps to Lower Your Bills appeared first on The New Stack.

]]>

Cloud computing offers organizations the ability to scale in ways that were impossible only a decade ago, and today thousands of companies are doing just that. However, with great power comes great responsibility.

If companies fail to optimize their cloud systems effectively, they may encounter additional expenses, fail to meet performance expectations and even face security risks.

Let’s look at some practical steps you can use to ensure the efficiency of your cloud environments, recommended approaches for cloud-computing platforms such as Amazon Web Services (AWS), and the various tools and services available to begin lowering the total cost of ownership (TCO) of your cloud infrastructure.

What Is Cloud Optimization?

Cloud optimization involves strategically planning, implementing and overseeing cloud resources to achieve optimal performance, cost efficiency and alignment with business objectives. While it can be a challenging and time-intensive process, it offers substantial advantages for organizations.

Key Elements of Cloud Optimization

Cost Optimization

Cost optimization is the process of identifying and eliminating unnecessary expenditures in cloud usage. This requires an analysis of current spending against actual business and technical requirements, to ensure you are not overspending and to identify avenues for cost reduction. It’s also smart to work with cloud providers that offer discounts for long-term commitments or specific resources, and to analyze how cloud resources are used during peak hours to right-size your application.

Performance Optimization

Performance optimization involves guaranteeing that cloud applications meet the desired performance standards. To optimize performance, selecting appropriate cloud resources, configuring them accurately and employing techniques such as caching to enhance performance is crucial. For instance, if an application is consuming a significant amount of memory, switching to a different cloud resource with higher memory capacity might improve its performance.

Security Optimization

Security optimization refers to the adoption of optimal practices and measures to safeguard cloud environments against unauthorized access, data loss and various threats. Achieving this involves using cloud security tools, enforcing security policies and protocols, and educating employees on the best practices for security.

How to Optimize Your Cloud Environment

Optimization is always an iterative process, requiring continual adjustment as time goes on. However, there are many quick wins and strategies that you can implement today to refine your cloud footprint:

  • Unused virtual machines (VMs), storage and bandwidth can lead to unnecessary expenses. Conducting periodic evaluations of your cloud usage and identifying such underutilized resources can effectively minimize costs (see the AWS CLI sketch after this list). Check your cloud console now. You might just find a couple of VMs sitting there idle, accidentally left behind after the work was done.
  • Temporary backup resources, such as VMs and storage, are frequently used for storing data and application backups. Automate the deletion process of these temporary backup resources to save money.
  • Selecting the appropriate tier entails choosing the cloud resource that aligns best with your requirements. For instance, if you anticipate a high volume of traffic and demand, opting for a high-end VM would be suitable. Conversely, for smaller projects, a lower-end VM might suffice.
  • Automation tools can simplify and automate routine tasks, such as instance provisioning, configuration management and deployment. Automating these processes can reduce the likelihood of errors and save significant amounts of time.
  • To keep your cloud environment safe from threats, configure firewalls, encrypt data and use strong passwords.
  • Container technology, including Docker and Kubernetes, can package and deploy applications in a consistent and scalable way. This can improve the portability and flexibility of applications.
  • Enhance the performance and cost-effectiveness of your database with database configuration, indexing and caching. This optimization process may involve implementing read replicas, caching frequently accessed data and optimizing database queries.
  • AWS Lambda and Azure Functions, along with other serverless computing technologies, enable code to run without the need to provision or manage servers. Deploying these can improve scalability and lower costs.
  • Incorporating DevOps practices, such as CI/CD, can streamline the application and service-development process. By adopting these practices, organizations can enhance agility and minimize time to market for their products and services.
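
As a concrete example of the first item in the list above, this hedged sketch uses the AWS CLI to surface stopped instances and unattached EBS volumes, two common leftovers from finished work that are easy to forget about:

aws ec2 describe-instances --filters "Name=instance-state-name,Values=stopped" --query 'Reservations[].Instances[].[InstanceId,LaunchTime]' --output table
aws ec2 describe-volumes --filters "Name=status,Values=available" --query 'Volumes[].[VolumeId,Size,CreateTime]' --output table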

Cloud Optimization Tools and Services

Numerous cloud-optimization tools are accessible, and the ideal tool or service for your situation depends upon your distinct needs and requirements.

  • Amazon Compute Optimizer identifies underutilized resources, finds cost-saving opportunities and improves overall performance.
  • GCP Cost Management identifies underutilized resources, finds cost-saving opportunities and provides budgeting and alert features.
  • Azure Advisor offers personalized recommendations based on usage and configuration. It helps identify cost-saving opportunities, enhances application performance and improves the overall cloud environment.
  • VMware Aria Cost powered by CloudHealth optimizes cloud costs, performance and security by tracking spending, identifying savings and improving workload performance.

Why Cloud Optimization Is Important

Cloud optimization maximizes performance, cost and security by efficiently managing and configuring resources. It saves money, enhances performance and reduces risk through techniques like right-sizing, using cloud native services, automation and monitoring.

Because cloud optimization is an ongoing process that needs to be updated as technology evolves and business needs change, it’s essential to keep up to date with the changing landscape to maximize the benefits of cloud computing and minimize the risks associated with it.

One way to optimize costs is to consolidate tools. Couchbase Capella is a fully managed cloud database that can easily replace SQL, nonrelational, online transaction processing (OLTP) and full-text search systems, allowing you to drastically lower your TCO and cloud spend. Learn how you can optimize your database with Couchbase Capella:

The post Cloud Optimization: Practical Steps to Lower Your Bills appeared first on The New Stack.

]]>
5 Common Developer Self-Service Challenges (and Solutions) https://thenewstack.io/5-common-developer-self-service-challenges-and-solutions/ Wed, 26 Jul 2023 17:00:44 +0000 https://thenewstack.io/?p=22713422

How do you enable developer self-service? That’s a question that more enterprises are seeking to answer as they embrace developer

The post 5 Common Developer Self-Service Challenges (and Solutions) appeared first on The New Stack.

]]>

How do you enable developer self-service?

That’s a question that more enterprises are seeking to answer as they embrace developer self-service — or platform engineering, as it’s sometimes called — as a means of maximizing the productivity and job satisfaction of their software engineers. Developer self-service represents the next evolutionary step for organizations that have already embraced DevOps since it helps their development teams to make DevOps practices more efficient.

The most obvious solution for enabling developer self-service might seem to be to deploy what’s known as an internal developer platform, or IDP. A growing number of IDPs — such as the open source project Backstage and Atlassian’s Compass, to name two popular examples — are available to help enterprises make a diverse set of technical solutions available to development teams on demand.

IDPs are certainly one component of an effective developer self-service strategy. However, achieving self-service functionality for development teams is more complicated than simply deploying a self-service platform and calling it a day. Businesses also need to address requirements like maintaining their platforms over the long term, updating them to keep up with evolving cloud technologies, managing quality and more. If you don’t think about these challenges ahead of time, you risk deploying an internal developer platform that falls far short of delivering on its maximum potential value.

To prove the point, I’d like to walk through five key challenges that I’ve seen in the field when working with businesses on developer self-service initiatives, along with tips on how to mitigate them.

The What and Why of Developer Self-Service

Before diving into challenges, let’s look at what developer self-service means and what benefits it brings to both developers and businesses.

Put simply, developer self-service is a model wherein software engineers can create the services and environments they need to be productive, without having to ask or wait on the IT department to set up the solutions. Typically, developers do this using an enterprise self-service platform that delivers a variety of ready-made, officially supported solutions — such as a service that allows developers to spin up a Kubernetes cluster quickly and easily, for example, or to create a CI/CD pipeline using approved tools.

When developers can acquire the resources they need quickly and on demand, they can work more efficiently, without having their workflows bottlenecked by bureaucratic processes. That leads to better job satisfaction, not to mention faster innovation for the business.

Developer self-service also helps to mitigate the risks of “shadow IT,” meaning resources that developers spin up without official permission. If a business allows developer self-service and provides ready-made developer solutions, software engineers will no longer have to spin up bespoke services and environments that may not comply with enterprise IT policies — and that are likely to end up unmanaged and unsecured because the IT department doesn’t know they exist.

In addition, the developer self-service model helps businesses to scale up specialized labor that is often in short supply. For example, many businesses have few Kubernetes experts on staff. By having those experts create Kubernetes solutions that developers can then spin up on demand, the business allows the experts to work more efficiently because the experts won’t need to create a bespoke Kubernetes environment for every developer team or need.

Nor do developers who lack a deep understanding of Kubernetes have to spend extensive amounts of their time trying to create an environment. The same logic applies to network engineers, sysadmins and any other type of expert whose knowledge is in high demand across the business.

In short, developer self-service is a way for developers to innovate faster and better, while simultaneously reducing risks for the business.

Developer Self-Service Platform Challenges

Deciding to provide developers with a self-service platform is one thing. Ensuring that the platform delivers the value it is supposed to provide is another, due to the following challenges.

1. Platforms That Don’t Align with Developer Challenges

Probably the most common challenge of creating a good self-service developer solution is the risk that the solution won’t actually address developer pain points.

This can happen because the people who design and build the self-service platform aren’t always developers — or aren’t representative of the enterprise’s development teams as a whole. As a result, they don’t know what developers really need to be more productive.

The best way to mitigate this risk is to treat your self-service platform like a product and assign a product manager (or several) to it. That approach ensures that someone is responsible for assessing exactly what the platform is supposed to do based on what its “customers” — the developers in the organization — need it to do. Otherwise, you risk building solutions that sound interesting but that your development teams don’t really need.

2. Maintaining Quality

Ensuring that your self-service platform meets quality requirements is another common challenge. Low-quality tools and services, compatibility issues between tools and so on can hamper the self-service experience and undercut the value of a self-service platform. If your developers have to spend time fixing the solutions you give them, then the self-service model doesn’t provide the benefits it’s supposed to.

One way to address this challenge is to create automated tests for self-service solutions. For example, you could write an automated test that spins up one of your platform’s solutions, and then evaluates whether it behaves as it’s supposed to. If you run these tests once a week or so, you’ll be able to detect problems early and ensure that your developers aren’t interrupted by platform bugs or other issues.
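
What such a test looks like depends entirely on your platform, but as a hedged illustration, a scheduled smoke test for a Kubernetes-based golden template might simply apply it, wait for the rollout and clean up. The manifest path and resource names here are placeholders:

# spin up the platform's template in a throwaway namespace
kubectl create namespace selfservice-smoke-test
kubectl apply -n selfservice-smoke-test -f golden-templates/web-service.yaml
# fail the test if the workload doesn't become ready within two minutes
kubectl rollout status -n selfservice-smoke-test deployment/web-service --timeout=120s
kubectl delete namespace selfservice-smoke-test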

More broadly, assigning product managers to the self-service problem can help improve quality. Product managers can interface with developers to collect feedback about issues they are experiencing with the platform, and then oversee the mitigation of those shortcomings.

3. Self-Service Platform Maintenance

Even if your developer self-service platform is of high quality, it’s likely to break down over time if no one is assigned to maintain it. The solutions it includes may change over time, leading to support or compatibility problems that need to be addressed through updates. You may also want to add new solutions to the platform, and that requires a maintenance and change-request process.

Here again, having a product manager in place will help to ensure that the self-service platform receives the maintenance it needs. You’ll also ideally have a development team assigned to the platform so it can make changes when required.

4. Cost Concerns

A well-implemented developer self-service platform will save money by allowing developers to operate more efficiently. However, like any other IT resource, the platform itself costs money to build and run, and if you don’t think strategically about how to optimize for costs, you may find that your self-service solutions break your budget.

The antidote here is to have a cost management and optimization strategy for your self-service platform, just as you would for any other solution with significant cost implications for your business. In other words, it’s important to consider the benefits provided as well as the costs for internal developer platform initiatives.

5. Lack of Engagement

Finally, you may run into the challenge that you create a self-service platform and your developers don’t actually use it because they don’t know about it or understand its value. “Build it, and they will come” rarely works in the enterprise.

Avoid this challenge by ensuring that your self-service platform actually solves developer pain points. Advertising the solutions to developers and making them extremely easy to use through excellent documentation and deployment guides will also go far to encourage developers to make the most of the self-service solutions available to them.

Conclusion: Getting More from Developer Self-Service

Deciding to offer your developers solutions on a self-service basis is the first step in taking development workflows to the next level. But just as adopting a CI/CD pipeline doesn’t automatically guarantee all of the benefits that DevOps is supposed to provide, simply building a self-service platform is no guarantee that your developers will enjoy the productivity benefits of a self-service model.

But when you plan for the challenges of developer self-service, you can get ahead of potential weaknesses and create a self-service solution that delivers maximum value to your development teams.

The post 5 Common Developer Self-Service Challenges (and Solutions) appeared first on The New Stack.

]]>
Choosing the Right Database Strategy: On Premises or Cloud? https://thenewstack.io/choosing-the-right-database-strategy-on-premises-or-cloud/ Mon, 24 Jul 2023 14:04:40 +0000 https://thenewstack.io/?p=22713841

For modern businesses, the database serves as the core of their tech stack and organization. Hence, selecting the right database

The post Choosing the Right Database Strategy: On Premises or Cloud? appeared first on The New Stack.

]]>

For modern businesses, the database serves as the core of their tech stack and organization. Hence, selecting the right database software approach is crucial. The choice between on premises and cloud services is a critical decision that can affect agility, reliability, flexibility and cost-efficiency.

As companies strive to stay competitive, many have transitioned from legacy on-premises technology to embrace cloud databases, which have emerged as the future of IT infrastructure. However, the choice between on-premises and cloud-based solutions should not be solely influenced by trends, but rather by a comprehensive understanding of each organization’s specific requirements and benchmarks.

On Premises vs. Cloud: Finding the Right Approach

While on-premises solutions have long been dominant, cloud services have lured businesses away from traditional approaches. The notion of a “cloud first” or “cloud only” future is not entirely straightforward and is likely to remain complex for some time. There are a few key differentiators between on-premises and cloud-based solutions, making this a critical decision for a company. These differentiators include:

Deployment

With a public cloud provider, the deployment process is relatively straightforward. Businesses can quickly get up and running as the cloud-service provider handles hosting and provides access to their resources. However, businesses relinquish the responsibility of building, deploying and maintaining their own cloud servers. On the other hand, an on-premises approach requires the in-house IT team to take full ownership of deployment, and maintaining every element of the database becomes the organization’s responsibility, including addressing any physical server issues that may arise.

Control

The level of control offered by on-premises database solutions is a compelling reason for some businesses to retain this approach. Complete ownership of servers translates to full control over all stored data, making it particularly valuable for enterprises operating in highly regulated industries. In contrast, cloud storage entails relinquishing some control by entrusting the database architecture to a commercial provider. This may result in limitations on tool integration and expansion of the database management system infrastructure. However, it is important to note that the “managed” approach provided by cloud-service providers is not uniform. By carefully evaluating different cloud data-storage options, businesses can find a solution that aligns with their specific requirements.

Security and Compliance

Control is closely linked to security and compliance considerations. On-premises databases are favored by highly regulated enterprises, such as government agencies or health-care providers, due to the ability to manage all data in-house, providing greater protection against security risks. Similarly, having full control over data makes compliance adherence easier for organizations governed by stringent regulations.

Regardless, on-premises database management requires organizations to assume complete responsibility for addressing breaches and outages without the immediate support offered by cloud providers. Cloud solutions, when chosen wisely, can provide robust security measures and dynamic compliance capabilities. It is crucial to carefully evaluate different cloud providers and their offerings to ensure that security and compliance requirements are met effectively.

Cost

Cost is a pivotal aspect where the cloud has dominated conversations. Cloud enthusiasts often highlight the pay-as-you-go model, eliminating the need to budget for infrastructure maintenance and resource allocation. While the cloud can offer potential cost savings, accurately estimating the costs of a cloud database can be more complex than initially perceived. The cloud is not necessarily cheaper overall, but it offers a different expense structure.

The decision between on-premises and cloud-based database solutions is a complex one that requires careful consideration of deployment, control, security, compliance and cost factors. While the cloud offers agility, scalability and cost-efficiency, on-premises solutions provide organizations with full control and ownership of their data.

Highly regulated industries often prefer on-premises solutions due to security and compliance concerns, but cloud providers have made significant strides in addressing these issues.

Cost considerations also play a crucial role, with the cloud’s pay-as-you-go model offering potential savings but requiring accurate cost estimation. Additionally, the hybrid cloud approach has gained traction, offering a blend of public and private cloud databases for enhanced flexibility. Ultimately, organizations must assess their specific needs and priorities to determine the best database strategy for their business.

About EDB

EDB provides enterprise-grade software and services that enable organizations to harness the full power of Postgres, the world’s leading open source database. EDB provides unmatched Postgres database expertise and enables the same Postgres everywhere, including solutions for hybrid, self-managed private clouds and EDB BigAnimal, a fully managed cloud database-as-a-service. For more information, visit our website.

The post Choosing the Right Database Strategy: On Premises or Cloud? appeared first on The New Stack.

]]>
3 Reasons Why Teams Move Away from AWS Lambda https://thenewstack.io/three-reasons-why-teams-move-away-from-aws-lambda/ Tue, 18 Jul 2023 17:00:49 +0000 https://thenewstack.io/?p=22713075

When Amazon Web Services first introduced Lambda in November 2014, it touted it as a compute service “that runs your code in response

The post 3 Reasons Why Teams Move Away from AWS Lambda appeared first on The New Stack.

]]>

When Amazon Web Services first introduced Lambda in November 2014, it touted it as a compute service “that runs your code in response to events and automatically manages the compute resources for you, making it easy to build applications that respond quickly to new information.”

It was a big deal because it raised the level of abstraction as high as you could imagine in terms of operationalizing code: write a function and Lambda takes care of the rest.

The consumption-based pricing model was also revolutionary in that you only paid for the amount of compute actually used, and the functions scaled down to zero when unused.

However, the total cost of running functions (when you factor in compute, networking and other AWS services required to trigger and orchestrate the functions) could be higher than the cost of compute on a simpler abstraction like AWS EC2 (if you count active compute cycles only) — it’s the price you pay for the amount of value bundled into the higher level abstraction. It also means less flexibility in terms of what you can actually do from your function’s code and the programming languages available for use.

To explain this to my mother, I told her that it is like the difference between ordering food delivery and cooking yourself: ordering a meal via a delivery app is very convenient, but there is less choice and it is more expensive. Cooking yourself provides you with all the freedom of choice at a cheaper cost, but there is more initial investment required from you to cook your meal. Especially if you want to make pad Thai (pictured above), which requires some uncommon ingredients (at least in Europe), like tamarind paste.

Sometimes, the value of takeaway food is worth the higher cost and reduced choice when compared to the effort required to cook the same meal at home.

Let’s dive into the three main reasons why some teams move away from AWS Lambda to lower-level computing abstractions. Read on until the end for tips on how you can migrate smoothly from AWS Lambda to functions running on Amazon EKS.

Reason #1: Cost

It is very easy to get started with a service like AWS Lambda. If you are a small team starting a new project, you want to maximize your chances of getting to market quickly and getting feedback early. Lambda lets you ship fast by turning as much capital expenditure into operational expenditure as possible.

But at some point, the higher costs of serverless functions when compared to lower-level computing abstractions like virtual machines or containers can become a problem, particularly when an application starts receiving a lot of traffic.

This topic came into the spotlight recently when a team working on Amazon Prime Video cut costs by 90% by moving from Lambdas to a monolith on EC2. They were largely benefiting from the serverless scaling mechanics, but by moving everything into a monolith they cut down massively on orchestration and data transfer costs.

Reason #2: Focusing on a Single Abstraction

There are many types of workloads that you probably don’t want to run on AWS Lambda. For example, ETL data processing or service orchestrations won’t leverage the scalability benefits provided by Lambda and are likely to hit limits imposed by AWS such as total execution time. When a platform team has to support multiple computing paradigms for their organization’s developers, such as lambdas and containers, it adds complexity to their work.

Lambdas and containers each require different solutions to manage various steps of the software development lifecycle. The way you develop, test, deploy, secure and monitor an AWS Lambda function is very different from how you would do the same for a containerized workload running on a container orchestrator like Amazon’s managed Kubernetes service EKS. What we’re hearing from the TriggerMesh community (you can speak to them directly on Slack) is that operations teams will sometimes prefer to unify their operations on a single abstraction like containers, running on Kubernetes, rather than having to solve the same problems in different ways across multiple abstractions. This has a few other benefits:

  • It makes the landscape simpler for developers in the organization: a single paradigm for deploying code means a single paradigm to learn and master.
  • It lets teams capitalize on their Kubernetes expertise and optimize the usage of the resources made available in their clusters.
  • It creates a more cloud-agnostic, portable way to write business logic, with less lock-in to a specific cloud vendor’s services, which brings us to reason #3.

Of course, not all platform teams have the skills or desire to base all their operations on Kubernetes; some will lean toward simpler systems like ECS or Fargate for example.

Reason #3: Portability

A portable application is one that can run on different platforms with minimal changes to the application code. The “platform” of a cloud-native application is made up of the compute, storage, networking and other managed services used by the application and provided by the underlying cloud platform. Therefore, the portability of a cloud-native application can be defined along two dimensions:

  • The degree of coupling between the application and the compute engine it is running on. For example, what is the cost of migrating a function from AWS Lambda to Google Cloud Functions?
  • The degree of coupling between the application and the cloud services it uses. For example, if an application subscribes to notifications for new files on an AWS S3 bucket, how easily can it be ported to ingest similar notifications for new files on a Google Cloud Storage bucket?

Companies are increasingly dealing with multicloud architectures. A recurring reason is that through mergers and acquisitions, companies that may have initially been all-in on one cloud provider find themselves operating software across multiple clouds. Some choose to lean into the multicloud way and maintain a footprint on multiple cloud platforms, while others prefer to migrate all their applications to a single cloud. There is no right answer and each has its pros and cons. But in both cases, portability can bring significant benefits.

If you’re migrating apps from one cloud to another, enabling application portability can allow for gradual and less risky migrations. You change a small number of variables at a time rather than doing a big-bang update. And if you’re committing to multicloud, then creating a certain level of portability means that developers can more easily consume resources, data, and events from different clouds. Without a portability layer, each developer has to reimplement integration logic for each cloud which slows down development and increases cognitive load. DevOps teams are trying to offload these responsibilities to the platform so that application developers can focus on what they do best.

What Are People to Use Instead of AWS Lambda?

The three points discussed in this post raise the question: is there a way to migrate Lambda functions to a more cost-efficient, unified and portable computing platform?

The good news is that there are now many established, open-source alternatives that let you run serverless functions. These often include, to varying degrees, the ability to run function code and trigger those functions with different event sources. Examples of technologies in this space are Knative, OpenFaaS, Apache OpenWhisk and TriggerMesh.

For platform teams with a focus on Kubernetes, a three-part recipe is emerging as a way to migrate away from traditional Lambda functions:

  1. Write Lambdas using AWS’s Custom Lambda Runtimes.
  2. Deploy the functions on Kubernetes with Knative Serving.
  3. Trigger the functions with TriggerMesh.

Because AWS Lambda functions can now be built with containers that use AWS’s Custom Lambda Runtimes, you can actually use those same container images and deploy them anywhere that can run containers.

Knative Serving provides a way to take a containerized service and deploy it to Kubernetes such that it will scale to zero when idle, scale horizontally according to load and become addressable so that other workloads on Kubernetes can route events to it.
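For a concrete sense of what such a service looks like, here is a minimal sketch in Python: a plain HTTP server that reads the PORT environment variable Knative injects at runtime. The handler logic and response body are purely illustrative; in practice this file would be built into a container image and deployed as a Knative Service.

```python
# Minimal HTTP service suitable for deployment as a Knative Service.
# Knative injects the PORT environment variable; everything else here
# (handler logic, response body) is illustrative.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length).decode("utf-8") if length else "{}"
        # Business logic would go here; echo the payload back for illustration.
        body = json.dumps({"received": json.loads(payload or "{}")}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    port = int(os.environ.get("PORT", "8080"))  # Knative sets PORT at runtime
    HTTPServer(("", port), Handler).serve_forever()
```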

Knative Serving can easily be installed on Amazon EKS.

The final piece of the recipe is the triggering mechanism. Although Knative comes with a few triggers out of the box, it doesn’t include triggers for AWS services that you might have been using to trigger your Lambda functions. TriggerMesh is a popular open-source solution to expand the range of triggers for your Knative serverless functions and includes AWS services as sources of events such as SQS and S3. And because TriggerMesh can run natively on Kubernetes, along with your Knative services and other workloads, it can easily pull events into EKS (or other K8s distributions) from external sources so that you can filter, transform and route those events to the services you need to trigger. (Have a look at this guide for an example.)
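TriggerMesh and Knative deliver events to services as CloudEvents over HTTP, with event attributes carried in ce-* headers in binary mode. The sketch below shows the kind of routing logic such a service might apply; the event type strings and handler names are illustrative assumptions, not TriggerMesh’s actual type identifiers.

```python
# Sketch of routing logic for events delivered to a Knative service in
# binary CloudEvents mode (event attributes arrive as ce-* HTTP headers).
# The event type strings and handler names below are illustrative.
import json


def route_cloudevent(headers: dict, body: bytes):
    event_type = headers.get("ce-type", "")
    source = headers.get("ce-source", "")
    data = json.loads(body or b"{}")

    if event_type == "com.amazon.s3.objectcreated":  # illustrative type string
        return handle_new_object(source, data)
    if event_type == "com.amazon.sqs.message":       # illustrative type string
        return handle_queue_message(source, data)
    return {"status": "ignored", "type": event_type}


def handle_new_object(source, data):
    # Equivalent of a Lambda handler reacting to an S3 notification.
    return {"status": "processed", "source": source, "fields": list(data)}


def handle_queue_message(source, data):
    # Equivalent of a Lambda handler reacting to an SQS message.
    return {"status": "processed", "source": source}
```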

You might be wondering if Amazon EventBridge could be used to trigger your function on EKS, as it provides similar functionality to TriggerMesh but as a managed solution. But because EventBridge is push-based and typically isn’t running in the same VPC as your EKS cluster, it isn’t easy to push events from EventBridge into EKS to trigger your functions.

Choose the Right Path for Your Organization

As always, the devil is in the details and there is no one-size-fits-all approach to these questions. In this post, we covered three major reasons why some teams are moving away from Lambda; people have raised others related to security and delayed updates to Lambda runtimes. However, according to industry surveys, Lambda remains a thriving service and provides a quick way to get units of business logic running reliably in the cloud.

The post 3 Reasons Why Teams Move Away from AWS Lambda appeared first on The New Stack.

]]>
SCARLETEEL Fine-Tunes AWS and Kubernetes Attack Tactics https://thenewstack.io/scarleteel-fine-tunes-aws-and-kubernetes-attack-tactics/ Thu, 13 Jul 2023 14:08:55 +0000 https://thenewstack.io/?p=22713125

With SCARLETEEL, attackers can exploit a vulnerable Kubernetes container and pivot to going after the underlying cloud service account. Back

The post SCARLETEEL Fine-Tunes AWS and Kubernetes Attack Tactics appeared first on The New Stack.

]]>

With SCARLETEEL, attackers can exploit a vulnerable Kubernetes container and pivot to going after the underlying cloud service account.

Back in February, the Sysdig Threat Research Team discovered a sophisticated cloud attack in the wild, SCARLETEEL, which exploited containerized workloads and leveraged them into AWS privilege escalation attacks. That was bad. It’s gotten worse. Now, Sysdig has found it targeting more advanced platforms, such as AWS Fargate.

Consistent with its earlier playbook, the group’s recent activities involved compromising AWS accounts by exploiting vulnerable compute services, establishing persistence, and deploying cryptominers for financial gain. If left unchecked, the group was projected to mine approximately $4,000 per day.

But, wait, there’s more! SCARLETEEL is also in the business of intellectual property theft.

During the recent attack, the group discovered and exploited a loophole in an AWS policy, allowing them to escalate privileges to AdministratorAccess, thereby gaining total control over the targeted account. They have also expanded their focus to Kubernetes, intending to scale up their attacks.

The recent attack brought some new features to the fore. These included:

  • Scripts capable of detecting Fargate-hosted containers and collecting credentials.
  • Escalation to admin status in the victim’s AWS account to start EC2 instances running miners.
  • Improved tools and techniques to enhance attack capabilities and evasion.
  • Attempts to exploit IMDSv2 to retrieve tokens and AWS credentials.
  • Multiple changes in C2 domains, leveraging public services for data transmission.
  • Use of the AWS CLI and pacu on exploited containers to increase AWS exploitation.
  • Use of the Kubernetes penetration testing tool peirates to further exploit Kubernetes.

SCARLETEEL has also shown a particular fondness for AWS credential theft by exploiting JupyterLab notebook containers deployed in a Kubernetes cluster. This approach involved leveraging several versions of credential-stealing scripts, employing varying techniques and exfiltration endpoints. These scripts hunt for AWS credentials in the instance metadata service (both IMDSv1 and IMDSv2), in the filesystem, and within Docker containers on the target machine, regardless of their running status.

Interestingly, the exfiltration function employed uses shell built-ins to transmit the Base64 encoded stolen credentials to the C2 IP Address, a stealthier approach that evades tools that typically monitor curl and wget.

By manipulating the “--endpoint-url” option, the group also redirects API requests away from default AWS service endpoints, preventing these requests from appearing in the victim’s CloudTrail. Given the opportunity, it will download and run Mirai Botnet Pandora, a Distributed Denial of Service (DDoS) malware program.

After collecting the AWS keys, SCARLETEEL ran automated reconnaissance in the victim’s AWS environment. A misstep in the victim’s user naming convention allowed the attackers to bypass a policy that would have otherwise prevented access key creation for admin users.

Once admin access was secured, SCARLETEEL focused on persistence, creating new users and access keys for all users in the account. With admin access, the group then deployed 42 instances of c5.metal/r5a.4xlarge for cryptomining.

Although the noisy launch of excessive instances led to the attacker’s discovery, the assault did not stop there. The attacker turned to other new or compromised accounts, attempting to steal secrets or update SSH keys to create new instances. In the event, the lack of privileges thwarted further progression.

Still, this is a disturbing attack. “The combination of automation and manual review of the collected data makes this attacker a more dangerous threat,” pointed out the report’s author, Alessandro Brucato, a Sysdig threat research engineer. “It isn’t just nuisance malware, like a crypto miner is often thought of, as they are looking at as much of the target environment as they can.”

The SCARLETEEL operation’s continued activity underscores the need for multiple defensive layers, including runtime threat detection and response, vulnerability management, cloud security posture management (CSPM), and cloud infrastructure entitlement management (CIEM). The absence of these layers could expose organizations to significant financial risks and data theft. To deal with attackers like SCARLETEEL, it’s all hands and tools on deck.
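As one illustrative example of such a layer (not taken from the Sysdig report), a defender could periodically flag IAM users and access keys created in the last few days, surfacing the kind of mass key creation SCARLETEEL used for persistence. The sketch below assumes boto3 and credentials allowed to call iam:ListUsers and iam:ListAccessKeys; the lookback window is arbitrary.

```python
# Illustrative defensive check: flag IAM users and access keys created
# recently. Assumes boto3 and permissions for iam:ListUsers and
# iam:ListAccessKeys; the three-day lookback is an arbitrary choice.
from datetime import datetime, timedelta, timezone

import boto3

LOOKBACK = timedelta(days=3)  # tune to your environment


def recent_iam_activity():
    iam = boto3.client("iam")
    cutoff = datetime.now(timezone.utc) - LOOKBACK
    findings = []
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            if user["CreateDate"] > cutoff:
                findings.append(("new-user", user["UserName"], user["CreateDate"]))
            keys = iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]
            for key in keys:
                if key["CreateDate"] > cutoff:
                    findings.append(("new-access-key", user["UserName"], key["CreateDate"]))
    return findings


if __name__ == "__main__":
    for kind, user, created in recent_iam_activity():
        print(f"{kind}: {user} created {created.isoformat()}")
```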

The post SCARLETEEL Fine-Tunes AWS and Kubernetes Attack Tactics appeared first on The New Stack.

]]>
Generative AI Cloud Platforms: AWS, Azure, or Google? https://thenewstack.io/generative-ai-cloud-services-aws-azure-or-google-cloud/ Fri, 30 Jun 2023 16:00:15 +0000 https://thenewstack.io/?p=22712279

With the rise of generative AI, the top hyperscalers — Amazon Web Services, Google, and Microsoft — are engaging in yet another

The post Generative AI Cloud Platforms: AWS, Azure, or Google? appeared first on The New Stack.

]]>

With the rise of generative AI, the top hyperscalers — Amazon Web Services, Google, and Microsoft — are engaging in yet another round of intense competitive battles.

Generative AI needs massive computing power and large datasets, which makes the public cloud an ideal platform choice. From offering the foundation models as a service to training and fine-tuning generative AI models, public cloud providers are in a race to attract the developer community and the enterprise.

This article analyzes the evolving strategies of Amazon, Google, and Microsoft in the generative AI segment and summarizes the current state of GenAI services offered by the key public cloud providers.

Amazon Web Services: Betting Big on Amazon Bedrock and Amazon Titan

Compared to its key competitors, AWS is late to the generative AI party, but it is quickly catching up.

When it comes to generative AI, there are three key services in which AWS is investing — Amazon SageMaker JumpStart, Amazon Bedrock, and Amazon Titan.

Amazon SageMaker JumpStart is an environment to access, customize, and deploy ML models. AWS recently added support for foundation models, enabling customers to consume and fine-tune some of the most popular open source models. Through the partnership with Hugging Face, AWS made it easy to perform inference or fine-tune an existing model from a catalog of curated open source models. This is a quick approach to bringing generative AI capabilities to SageMaker.

AWS has revealed Amazon Bedrock, currently in private preview, as a serverless platform for consuming foundation models through an API. Though AWS hasn’t shared many details, it does look like a competitive offering compared to Azure OpenAI. Customers would be able to access secure endpoints exposed through the private subnet of their VPC.

Amazon has partnered with GenAI startups such as AI21Labs, Anthropic, and Stability.ai to offer text and image-based foundation models through the Amazon Bedrock API.

Amazon Titan is a collection of home-grown foundation models built by its own researchers and internal teams. Titan is expected to bring some of the models that power services such as Alexa, CodeWhisperer, Polly, Rekognition, and other AI services.

I expect Amazon to launch commercial foundation models for code completion, word completion, chat completion, embeddings, translation, and image generation. These models would be exposed through Amazon Bedrock for consumption and fine-tuning.

Amazon may also launch a dedicated vector database as a service under the Amazon RDS or Aurora family of products. For now, it supports pgvector, a PostgreSQL extension for performing similarity searches on embeddings, which is available through Amazon RDS.
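As a rough sketch of what that looks like in practice, the snippet below runs a pgvector similarity search from Python against a PostgreSQL database (for example, an RDS instance with the extension enabled). The connection string, table, and three-dimensional embeddings are placeholders.

```python
# Sketch of pgvector-based similarity search against PostgreSQL
# (e.g. Amazon RDS with the pgvector extension available). Connection
# details, table name, and the tiny 3-dimensional vectors are placeholders.
import psycopg2

conn = psycopg2.connect("postgresql://user:password@my-rds-host:5432/mydb")  # placeholder DSN
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id bigserial PRIMARY KEY,
        content text,
        embedding vector(3)
    );
""")
cur.execute(
    "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)",
    ("hello world", "[0.1, 0.2, 0.3]"),
)

# Nearest-neighbor search: <-> is pgvector's Euclidean distance operator.
cur.execute(
    "SELECT content FROM documents ORDER BY embedding <-> %s::vector LIMIT 5",
    ("[0.1, 0.2, 0.25]",),
)
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()
```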

Google Cloud: Built on the Foundations of PaLM

A plethora of GenAI-related announcements dominated Google I/O 2023. Generative AI is important for Google, not just for its cloud business but also for its search and enterprise businesses based on Google Workspace.

Google has invested in four foundation models: Codey, Chirp, PaLM, and Imagen. These models are available through Vertex AI for Google Cloud customers to consume and fine-tune with custom datasets. The model garden available through Vertex AI has open source and third-party foundation models. Google has also launched a playground (GenAI Studio) and no-code tools (Gen App Builder) for building apps based on GenAI.
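As a hedged illustration of how a Google Cloud customer might consume one of these models, the snippet below uses the Vertex AI Python SDK as it looked in mid-2023; the project, region, and model version are placeholders, and the SDK surface may have changed since.

```python
# Minimal sketch of consuming a PaLM text model through the Vertex AI SDK
# (google-cloud-aiplatform). Project, region, and model version are
# placeholders; the SDK surface reflects the mid-2023 preview.
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholders

model = TextGenerationModel.from_pretrained("text-bison@001")
response = model.predict(
    "Summarize the trade-offs between managed and self-hosted Kubernetes.",
    temperature=0.2,
    max_output_tokens=256,
)
print(response.text)
```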

Extending the power of LLMs to DevOps, Google has also integrated the PaLM 2 API with Google Cloud Console, Google Cloud Shell, and Google Cloud Workstations to add an assistant to accelerate operations. This capability is available through Duet AI for Google Cloud.

A native vector database is missing in Google’s GenAI portfolio. It should add the ability to store and search vectors in BigQuery and BigQuery Omni. For now, customers will have to rely on the pgvector extension added to Cloud SQL or use a third-party vector database such as Pinecone.

For a detailed review of Google’s generative AI strategy, read my deep dive analysis published at The New Stack.

Microsoft Azure: Making the Most of Its OpenAI Investment

With an exclusive partnership with OpenAI, Microsoft is ahead of its competitors in the generative AI game. Azure OpenAI is one of the most mature and proven GenAI platforms available in the public cloud.

Azure OpenAI brings most of the foundation models (excluding Whisper) from OpenAI to the cloud. Available through the same API and client libraries, customers can quickly consume engines such as text-davinci-003 and gpt-35-turbo on Azure. Since they are launched within an existing subscription and optionally a private virtual network, customers benefit from security and privacy for their data.
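As an illustration, the snippet below calls an Azure OpenAI deployment with the 0.x-era openai Python library; the endpoint, API version, and deployment name are placeholders for values from your own Azure OpenAI resource.

```python
# Sketch of calling an Azure OpenAI deployment with the openai Python
# library (0.x-era API surface). Endpoint, API version, key, and
# deployment name are placeholders.
import openai

openai.api_type = "azure"
openai.api_base = "https://my-resource.openai.azure.com/"  # placeholder endpoint
openai.api_version = "2023-05-15"
openai.api_key = "..."  # from the Azure portal or Key Vault

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",  # name of your deployment, not the model family
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Azure Kubernetes Service?"},
    ],
    temperature=0.2,
)
print(response["choices"][0]["message"]["content"])
```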

Microsoft has integrated foundation models with Azure ML, a managed ML platform as a service. Customers can use familiar tools and libraries to consume and fine-tune the foundation models.

Microsoft has also invested in an open source project named Semantic Kernel, which aims to bring LLM orchestration, such as prompt engineering and augmentation, to C# and Python developers. It’s similar to LangChain, a popular open source library for interacting with LLMs.

When it comes to vector databases, Microsoft has extended Azure Cosmos DB and Azure Cache for Redis Enterprise to support semantic search.

The post Generative AI Cloud Platforms: AWS, Azure, or Google? appeared first on The New Stack.

]]>
Microsoft Adopts OpenInfra Kata Containers Security on Azure https://thenewstack.io/microsoft-adopts-openinfra-kata-containers-security-on-azure/ Thu, 29 Jun 2023 17:14:18 +0000 https://thenewstack.io/?p=22712254

Everyone wants more security for their cloud processes and data. So, to deliver this, Microsoft announced at the recent OpenInfra

The post Microsoft Adopts OpenInfra Kata Containers Security on Azure appeared first on The New Stack.

]]>

Everyone wants more security for their cloud processes and data. At the recent OpenInfra Summit, Microsoft announced that it’s a step closer to delivering it to Azure customers. The means? Confidential containers on Azure Kubernetes Service (AKS) built on open source Kata Containers. This development aims to strengthen cloud security and offer enhanced protection for sensitive data and applications.

Kata Containers provide a secure container runtime using lightweight VMs. These feel and act like containers but come with the stronger workload isolation of VMs. Kata relies on AMD SVM and Intel VT-x CPU-based virtualization technology for this extra level of protection.

Azure’s Implementation

In its implementation, Azure leverages AMD’s SEV-SNP hardware-backed Trusted Execution Environments (TEEs) to provide confidential Kata Containers. These offer integrity for code and data in use, protect data in memory from Azure operators, and enable remote cryptographic verification through attestation. And with all this, existing unmodified applications can continue to run seamlessly on these containers.

To achieve this level of isolation, similar to application enclaves, and to enhance protection from VM administrators, these containers run in dedicated “child” virtual machines (VMs), one per pod. Each container possesses its own memory encryption key with AMD SEV-SNP protections, and its lifecycle is associated with the lifecycle of the confidential Kubernetes pod.

By running Kubernetes pods with this level of isolation, using nested virtualization, customers benefit from application isolation from the parent VM and the tenant OS admin, while still enjoying the ability to run any Linux Open Container Initiative (OCI)-compliant container natively.

Kata and AKS

Michael Withrow, Microsoft’s AKS Product Manager, explained that not only had customers been demanding more security — that goes without saying these days — they’d specifically been asking for Kata. This OpenInfra Foundation technology has been getting a reputation for being easy to work with, easy to implement, and extremely secure.

In the field, this marriage of Kata and AKS can be used for workload isolation from a shared host, untrusted container isolation (aka sandboxing), and multitenancy with shared clusters. Practically speaking, Microsoft sees large markets for this in consumer banking, healthcare, the public sector, and defense.

While this isn’t quite ready for production yet, it’s now in public preview. Microsoft hopes to have it available for customers’ commercial use within the next few months. I expect many users to flock to it once it opens up for business.

The post Microsoft Adopts OpenInfra Kata Containers Security on Azure appeared first on The New Stack.

]]>