A lot of people jump right in and start coding a Python notebook that makes some calls to a model endpoint. They typically pass in a simple prompt and a little bit of data, fire off a request, get back some cool response, and immediately get excited about what they just built. Next, they run off to show their boss or team and say “hey, look how cool this is and what we can do”. The excitement builds and the company realizes the potential of the use case.
So management then asks, “How do we get this to production and will it be secure?”. Because, by the way, the data the data scientist passed in is actually corporate top secret.
Oh boy, so that little notebook now needs a whole platform built for it and it has to be ultra secure, so where do we start?
Whether you choose to build it or buy it, security should be the #1 thing on your mind. There are so many levels that security applies to in the system. And it’s not only the system itself: the users that access the GenAI system have to be monitored and secured too. If a user gets compromised, then the attacker has full access to any GenAI agents that user has access to! This is where the E5 licenses from Microsoft, and all the security they enable, actually start to make a lot of sense!
You are probably thinking, great, another thing to secure. And you’d be right, as the last thing you need is another attack vector that people can take advantage of to cause privacy breaches, legal exposures and lawsuits, or corporate blackmail events.
UIs (User)
First thing, you are going to need a UI or an API. Let’s assume the application will need a chat UI. The UI will need to be secured with some kind of IdP (authentication), likely Microsoft Entra, Amazon IAM, Okta/Auth0, etc. Since most frameworks these days ship with authentication provider support, this shouldn’t be a big task, but you will need to be able to pass that user credential to the rest of the layers, as they will need to know what the user can actually see (authorization).
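As a concrete illustration, here is a minimal sketch of validating an incoming bearer token and attaching the user’s identity to the request before it flows to the lower layers. It assumes an Entra-issued JWT, the PyJWT library, and FastAPI; the tenant and audience values are placeholders.

```python
# Minimal sketch: validate an Entra-issued JWT and expose the user's identity
# to downstream layers. Assumes PyJWT and FastAPI; tenant/audience are placeholders.
import jwt
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

TENANT_ID = "<your-tenant-id>"          # placeholder
AUDIENCE = "api://your-genai-platform"  # placeholder app registration
JWKS_URL = f"https://login.microsoftonline.com/{TENANT_ID}/discovery/v2.0/keys"

app = FastAPI()
bearer = HTTPBearer()
jwks_client = jwt.PyJWKClient(JWKS_URL)

def current_user(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> dict:
    """Validate the bearer token and return its claims (user id, groups, etc.)."""
    try:
        signing_key = jwks_client.get_signing_key_from_jwt(creds.credentials)
        return jwt.decode(
            creds.credentials,
            signing_key.key,
            algorithms=["RS256"],
            audience=AUDIENCE,
        )
    except jwt.PyJWTError as exc:
        raise HTTPException(status_code=401, detail=str(exc))

@app.post("/chat")
def chat(message: str, user: dict = Depends(current_user)):
    # Pass the validated claims down to the agent/orchestration layers so
    # authorization decisions can be made closer to the data.
    return {"user": user.get("oid"), "echo": message}
```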
Whatever system you decide to use (remember Buy It/Build It?) should be able to support external IdP integration; if it doesn’t, then I’d say it’s a no-go. After all, you may just want to allow external partners to use the system and charge them for it! We’ll revisit this in the Return on Investment post later!
And lastly, if you are going to roll this platform out to all your employees, you will likely need to consider the applicable accessibility laws and ensure that whatever you are using meets the needs of your users.
UIs (Management)
How will you add new agents? Will you let end users create them? When you create an agent, what will the agent be able to use in terms of tools, models, and features? Ultimately, this boils down to what the “user” can see in terms of tools, models, and features. How will this work?
APIs
If the system doesn’t need a UI, and all you need is to integrate it with your current applications via an API, then you can skip the user UI requirement and go straight to the Core and Orchestration layers. If you don’t need to keep track of things like chat history and token burn, then you can probably skip those layers too and go directly to the orchestration/workflow layer (similar to hosting your agent/tools/workflow in promptflow and Azure Machine Learning). But you’ll still need to be able to authenticate and authorize the user/application.
Agents
Agents are at the top of the food chain. They provide the template for what will occur when a user/app makes a request. They define the type of workflow/orchestration, the tools, the prompt, etc.
Because an agent has a configuration of tools, which in turn have access to various data sources, it’s very important that you set the authorization at the agent level properly. One of the most common mistakes people make (especially in Microsoft Copilot) is indexing a bunch of data that is corporate top secret and then exposing it via an agent without any permissions on it. That allows anyone that can log in to the UI/API to gain access to that data.
Having the ability to lock down your agents based on the tools and data they have access to is a vital feature for any GenAI platform.
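To make that concrete, here is a minimal sketch of an agent definition carrying its own allow-list of IdP groups, checked before anything runs. The schema and group names are hypothetical illustrations, not any particular product’s API.

```python
# Minimal sketch: agent-level authorization checked before any tool executes.
# The agent schema and group names are hypothetical illustrations.
from dataclasses import dataclass, field

@dataclass
class AgentDefinition:
    name: str
    prompt: str
    tools: list[str]
    allowed_groups: set[str] = field(default_factory=set)  # IdP group ids/names

    def authorize(self, user_groups: set[str]) -> None:
        """Deny by default: the user must be in at least one allowed group."""
        if not self.allowed_groups & user_groups:
            raise PermissionError(f"User is not authorized to use agent '{self.name}'")

# Example: a "top secret" knowledge agent locked to a single group.
finance_agent = AgentDefinition(
    name="finance-kb",
    prompt="Answer questions using the finance index only.",
    tools=["vector_search"],
    allowed_groups={"grp-finance-analysts"},
)

finance_agent.authorize(user_groups={"grp-finance-analysts"})  # passes
# finance_agent.authorize(user_groups={"grp-everyone"})        # raises PermissionError
```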
Tools
Most people build an agent, give it a prompt, point it at a model…and that’s it. Simple. However, those of us that have been around a while know that’s not how it works anymore.
You need much more advanced agent types than just a simple knowledge management agent (an agent that points to a vector database).
People want AI solutions that *do things*.
- They want analytical agents that take data from a database (using model-generated dynamic SQL), combine it with data from other source(s), and add it all to a complex prompt that is then fed into the model to get some advanced completion.
- They want agents that execute actions/functions against external systems.
- They want agents that can perform complex Plan/Replan workflows that run for hours or days.
Agents should be able to be built with plug-and-play workflows and plug-and-play tools that implement much more advanced patterns. And by the way, tools have tools.
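As an illustration, here is a minimal sketch of a plug-and-play tool contract in Python, including a tool that composes another tool (“tools have tools”). The interface is hypothetical, not any specific framework’s API.

```python
# Minimal sketch: a plug-and-play tool contract, plus a composite tool that
# itself calls another tool. The interface is hypothetical.
from typing import Protocol

class Tool(Protocol):
    name: str
    def run(self, user: dict, **kwargs) -> str: ...

class SqlQueryTool:
    name = "sql_query"
    def run(self, user: dict, question: str = "", **kwargs) -> str:
        # In a real system: generate SQL with the model, validate it, then
        # execute it under the *user's* credentials, not a service account.
        return f"rows for {question!r}"

class ReportTool:
    """A tool that has tools: it delegates to an inner tool before summarizing."""
    name = "report"
    def __init__(self, inner: Tool):
        self.inner = inner
    def run(self, user: dict, question: str = "", **kwargs) -> str:
        rows = self.inner.run(user, question=question)
        return f"summary of {rows}"

# Tools register by name so agents can mix and match them.
registry: dict[str, Tool] = {t.name: t for t in (SqlQueryTool(), ReportTool(SqlQueryTool()))}
print(registry["report"].run(user={"oid": "123"}, question="Q3 revenue"))
```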
Models
Agents should be able to plug and play the models they use. This allows you to move to the next model version, or to a completely new model, seamlessly, and as a result hopefully improve your latency and accuracy.
Additionally, creating multiple agents with different models allows for some great A/B testing.
As discussed in the “Train vs Mainstream models” post, there are many models out there that can be used by your agents and tools. Having the ability to add in any model from any external platform, and then have an agent or tool use that model, is pretty powerful.
Models can have various properties, such as temperature, top_p, top_k, etc. When creating a model, your UI should allow for dynamically adding model properties that the agent or tool knows how to pass to the model.
This is not an easy feature to implement, but it is something you should consider when going at it on your own.
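For example, a model registry entry might carry an open-ended parameter bag that the calling layer forwards untouched. A minimal sketch, assuming a hypothetical registry schema and the OpenAI Python client for the actual call:

```python
# Minimal sketch: a model registry entry with an open-ended parameter bag
# that is forwarded to the provider call. The registry schema is hypothetical;
# the call shown assumes the OpenAI Python client.
from dataclasses import dataclass, field
from openai import OpenAI

@dataclass
class ModelConfig:
    name: str                                    # deployment/model name
    endpoint: str                                # where the model is hosted
    params: dict = field(default_factory=dict)   # temperature, top_p, etc.

gpt = ModelConfig(
    name="gpt-4o",
    endpoint="https://api.openai.com/v1",
    params={"temperature": 0.2, "top_p": 0.9},   # added dynamically in the UI
)

client = OpenAI(base_url=gpt.endpoint)  # reads OPENAI_API_KEY from the environment

def complete(cfg: ModelConfig, prompt: str) -> str:
    response = client.chat.completions.create(
        model=cfg.name,
        messages=[{"role": "user", "content": prompt}],
        **cfg.params,  # whatever properties were configured, passed straight through
    )
    return response.choices[0].message.content
```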
If you fine-tune your models, or use models built on your own data, it becomes even more important to secure access to the model, because it now has your corporate confidential information integrated into it. A compromise of an agent that uses the model can lead to corporate data leakage.
Endpoints
Models can be hosted in any number of places; the same model can be hosted in Azure, AWS, and GCP. Endpoints can be defined to specify where an agent or tool makes the call to a specific model. These endpoints should also be locked down. You don’t want someone adding a very expensive model to their agent, exposing it to every person in your company, and thus getting a $100K bill that month.
Not only are there endpoints for your models, but if your GenAI system enables external application integration through API calls, those also need to be secured. This can be done through managed identities, API keys, certificates, etc.
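For example, on Azure a managed identity avoids storing API keys at all. A minimal sketch, assuming the azure-identity package and an Azure OpenAI endpoint (the endpoint URL, deployment name, and API version are placeholders):

```python
# Minimal sketch: call an Azure OpenAI endpoint with a managed identity
# instead of an API key. Endpoint, deployment, and api_version are placeholders.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),  # managed identity, workload identity, or dev login
    "https://cognitiveservices.azure.com/.default",
)

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,  # no API key stored anywhere
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<your-deployment-name>",  # placeholder deployment
    messages=[{"role": "user", "content": "ping"}],
)
```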
Now that your applications are integrated, they have access to your model, and anyone that has access to the application now has access to the model. What if the application is compromised and starts to send a high number of requests to your GenAI platform? How will you monitor and control this?
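One common control is a per-caller rate limit in front of the model endpoints. A minimal in-memory sketch; a production deployment would typically back this with a shared store such as Redis:

```python
# Minimal sketch: per-caller sliding-window rate limit in front of the model
# layer. In-memory only; a real deployment would use a shared store (e.g. Redis).
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 100  # per caller per window; tune to your cost/risk profile

_requests: dict[str, deque] = defaultdict(deque)

def allow_request(caller_id: str) -> bool:
    """Return True if this caller (user or app identity) is under its limit."""
    now = time.monotonic()
    window = _requests[caller_id]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()  # drop requests that fell out of the window
    if len(window) >= MAX_REQUESTS:
        return False      # throttle, alert, or require re-authentication
    window.append(now)
    return True
```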
The layers are starting to add up! End-to-end zero-trust security is looking even more complicated.
How are you passing the user credentials from the app to the platform to the model? What if multiple IdPs are involved (Entra to AWS to GCP, anyone)? This is not an easy task.
Chat Sessions
Users are typically the owners of their own chat sessions. Some platforms have started to toss around the idea of “sharing” chat sessions. This presents some interesting challenges, but it is a very cool idea. Microsoft Security Copilot allows for this, but it will put the thread into a read-only state after you have performed the share operation.
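A simple way to model this is an owner plus a read-only shared list, with the session frozen once it is shared. A hypothetical sketch, not how any particular product implements it:

```python
# Minimal sketch: chat-session ownership with share-then-freeze semantics.
# The schema is a hypothetical illustration.
from dataclasses import dataclass, field

@dataclass
class ChatSession:
    owner: str
    messages: list[str] = field(default_factory=list)
    shared_with: set[str] = field(default_factory=set)
    read_only: bool = False

    def share(self, user_id: str) -> None:
        self.shared_with.add(user_id)
        self.read_only = True  # freeze the thread once it is shared

    def append(self, user_id: str, message: str) -> None:
        if user_id != self.owner:
            raise PermissionError("Only the owner can write to this session")
        if self.read_only:
            raise PermissionError("Shared sessions are read-only")
        self.messages.append(message)

    def can_read(self, user_id: str) -> bool:
        return user_id == self.owner or user_id in self.shared_with
```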
Attachments
Users typically want to pass in attachments for the agents and tools to work on. This is very common for people that utilize the OpenAI Assistants API. Attachments must be stored, tracked, and available to be referenced as part of the chat history. In addition, any files generated by the model must also be saved and tracked.
These files/attachments should not be visible to other users; security should be defined such that only the user who owns that particular chat session has access to the chat and its attachments.
This gets even more complicated when you decide you want to be able to “share” chat sessions with other people, as noted above.
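Extending the session sketch above, an attachment check might simply inherit the session’s read permissions. Again a hypothetical illustration:

```python
# Minimal sketch: attachments inherit the chat session's read permissions.
# The session interface and schema are hypothetical illustrations.
from dataclasses import dataclass
from typing import Protocol

class SessionLike(Protocol):
    def can_read(self, user_id: str) -> bool: ...

@dataclass
class Attachment:
    attachment_id: str
    session: SessionLike     # the chat session this file belongs to
    blob_path: str           # where the bytes actually live

    def open_for(self, user_id: str) -> str:
        """Only users who can read the session can read its attachments."""
        if not self.session.can_read(user_id):
            raise PermissionError("Not authorized for this attachment")
        return self.blob_path  # in practice: return a short-lived signed URL
```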
Vectorization
Knowledge management agents are a common first step for most companies dipping their toes into the GenAI waters. These typically require you to vectorize some documents. Organizations have varying levels of requirements when exploring these paths: some may have only a handful of documents, while others may have thousands of documents or terabytes of data they want vectorized.
A data/vectorization pipeline is typically broken into four main steps (a minimal end-to-end sketch follows the list):
- Content Source
  - What content sources does the platform support? (SharePoint/M365, data lake, blob storage, Snowflake, etc.)
  - Does the system support pulling Access Control Lists (ACLs) from the source?
- Text Partitioning
  - Once you download the data from the content source, you have to break it apart. There are many different ways to do this for different file types (remember iFilters?). PDF files are a common one. Chunk size and overlap are common parameters that have to be experimented with.
- Text Embedding
  - Once you have the chunks, you need to embed them using some embedding model. The most popular option over the past two years has been text-embedding-ada-002. It produces 1536-dimensional vectors, which is pretty decent. However, if you look at where this model sits in the rankings of embedding models today, it’s somewhere around position #75.
  - At some point, you won’t want to utilize these old models anymore and you will want to migrate to a new model. This means a full re-vectorization of your content.
- Indexing
  - So you have the embeddings; where are you going to put them?
  - Cosmos DB? Azure AI Search? Pinecone? PostgreSQL?
  - This step saves your embeddings to the target store.
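Here is a minimal end-to-end sketch of those four steps, assuming the OpenAI Python client for the embedding call; the index writer is a hypothetical placeholder for whatever store you choose.

```python
# Minimal sketch of the four pipeline steps: source -> partition -> embed -> index.
# Assumes the OpenAI Python client; save_to_index is a hypothetical placeholder
# for your real vector store.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def partition(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Step 2: break a document into overlapping chunks."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def embed(chunks: list[str]) -> list[list[float]]:
    """Step 3: embed the chunks (1536-dimensional vectors for this model)."""
    response = client.embeddings.create(model="text-embedding-ada-002", input=chunks)
    return [item.embedding for item in response.data]

def save_to_index(chunk: str, vector: list[float], acl: list[str] | None) -> None:
    """Step 4 placeholder: write to Cosmos DB, Azure AI Search, Pinecone, PostgreSQL, etc."""
    ...

def run_pipeline(documents: list[dict]) -> None:
    """Step 1: `documents` are records pulled from your content source."""
    for doc in documents:
        chunks = partition(doc["text"])
        for chunk, vector in zip(chunks, embed(chunks)):
            save_to_index(chunk, vector, acl=doc.get("acl"))  # keep source ACLs alongside
```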
Content Sources
Although this was covered above under Vectorization, it is important to note that most platforms will let you ingest/vectorize any data you want, but very few will let you bring in the ACLs. This presents an important security issue: any knowledge management agent that is pointed at a vectorized datastore that had its ACLs stripped off now falls back to the security on the agent itself. This was pointed out above, but it is worth noting again, as it presents a pretty big security hole if you go live without security in place.
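If your store does keep the ACLs, query-time security trimming might look like the following sketch. The search function and document schema are hypothetical; stores like Azure AI Search can apply such filters natively at query time.

```python
# Minimal sketch: query-time security trimming against a vector store that
# kept the source ACLs. raw_search and the document schema are hypothetical.
def search_with_trimming(query_vector: list[float], user_groups: set[str],
                         raw_search, top_k: int = 5) -> list[dict]:
    """raw_search(vector, k) -> list of docs shaped like {"text": ..., "acl": [...]}."""
    results = raw_search(query_vector, k=top_k * 4)  # over-fetch, then trim
    allowed = [doc for doc in results if user_groups & set(doc.get("acl", []))]
    return allowed[:top_k]
```

Post-filtering like this is the simple version; pushing the filter into the index query itself is both safer and cheaper.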
System Data / Reporting
If you choose to store your chat sessions and agent completions, they will likely go into some kind of data store. Stakeholders will want to be able to gain access to basic reporting capabilities such as:
- Requests per User
- Tokens per User
- Errors
- Prompt and Completion token usage
- Charge back (cost centers)
- CPU, Memory and Network loads on the compute layer
- Latency and Requests for your models
Power BI is typically asked for, but because of some of the limitations of the product, it doesn’t end up being a viable option without a lot of extra work.
So this typically falls back to creating some kind of customized reporting via Python notebooks. Authentication should be done via Entra or IAM-based identities, not API keys (zero trust).
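As an example, a tokens-per-user report over a completions log might look like this sketch, assuming pandas and a hypothetical table layout:

```python
# Minimal sketch: tokens-per-user report over a completions log.
# Assumes pandas; the column names are a hypothetical table layout.
import pandas as pd

def tokens_per_user(completions: pd.DataFrame) -> pd.DataFrame:
    """Expects columns: user_id, prompt_tokens, completion_tokens."""
    return (
        completions
        .assign(total_tokens=lambda df: df.prompt_tokens + df.completion_tokens)
        .groupby("user_id", as_index=False)[["prompt_tokens", "completion_tokens", "total_tokens"]]
        .sum()
        .sort_values("total_tokens", ascending=False)
    )  # feed into charge-back / cost-center reporting
```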
System Access (Azure resources)
In addition to the data plane, you need to consider the control plane. This is the access to the compute and other resources that host the various layers. When issues start to pop up, someone will need to be able to log in to the containers/pods and look at the logs, as well as at things like Application Insights and Log Analytics.
You also need to prepare for things like upgrades: who will be responsible for gaining access to the AKS cluster and storage resources to do those container and scripted upgrades?
Summary
Still want to build your own GenAI platform from scratch? Security is an important part of the design, but most people overlook what that really means. There is a lot to consider across all the layers, and serious problems can arise if security is not implemented correctly.
It’s important to do a very detailed review of how users and applications access the system and how authentication and authorization flow through all the layers. Finding gaps where a malicious user/app can take advantage of weak security should be a top priority; either fill those gaps or determine mitigation techniques.
The last thing you need is to have to deal with a corporate data leak, a compliance issue, or a negative hit to your organization’s reputation.
Contact
Need help getting your GenAI project started and/or over the finish line? Ping me, always happy to help!
Email: givenscj@hotmail.com
Twitter: @givenscj
LinkedIn: http://linkedin.com/in/givenscj
GenAI Blog Series
- GenAI Blog Series
- #1 – Build it or Buy it/RentIt
- #2 – Host it or get SaaS-y
- #3 – Train vs Mainstream models
- #4 – Scaling your solution (and not break the bank)
- #5 – Implementing Security (oh so many levels)
- #6 – Reporting, Logging and Metrics
- #7 – MLOps/GenAIOps, some kind of *Ops
- #8 – Measuring Return on Investment (ROI)