
Brandon Allmark


Cloud Resume Challenge pt 3: Exploring Cosmos DB and Database Security

Please check out my resume at resume.allmark.me!
All feedback, good or bad, is appreciated 😄

Progress thus far:

  1. Certification ✔️
  2. HTML ✔️
  3. CSS ✔️
  4. Static Website ✔️
  5. HTTPS ✔️
  6. DNS ✔️
  7. JavaScript 🚧
  8. Database 🚧
  9. API 🚧
  10. Python 🚧
  11. Python Tests ❌
  12. Infrastructure as Code 🚧
  13. Source Control ❌
  14. Backend CI/CD ❌
  15. Frontend CI/CD ❌
  16. Blog Post 🚧

TL;DR
The user is working on a project where they want to track website visitors. They started with a simple visitor counter using the localStorage object, but this didn’t meet their needs. They decided to use Cosmos DB for their database and created a Bicep script to deploy it. They also created a Function App to interact with the database.

However, they encountered difficulties with securing the Function App and API keys. After researching various solutions, they decided to migrate their website from a Storage Account to an actual Static Web App. They noticed that Azure Static Web Apps has a preview feature that allows a direct connection to an Azure database with built-in security, which would simplify their design. They also set up a pipeline between their GitHub repo and their Static Web App, so that committing to their repo triggers a GitHub Actions workflow that deploys the code to their website.

They spent most of their time reading documentation and troubleshooting, and plan to detail their process in a future blog post. They also plan to find another way to incorporate Python into their project.

As complexity in this project is quickly ramping up, I opted to draw.io a map before proceeding. While mapping this out, I also decided I wanted to receive an email every time someone visited my website, telling me what their IP was and how many times they had visited.
[Image: map]
To get myself off to a running start, I opted to start with a visitor counter as simple as possible and then add complexity. My first draft looked like this:
[Image: counter]
It stores information via the localStorage object. This means the counter is browser-specific and would be cleared if the user were to clear their browser data. While it doesn't hit the criteria, the visuals gave me an idea of what to set up next:

  1. A database to store the counter’s tally.
  2. An API to interact with the database safely.

I chose to use Cosmos DB for my database as I’ve never worked with it before and wanted to get a better understanding of how it worked.

In this project, I plan to deploy as many resources as possible using Bicep. My favourite way to do this is to first spin something up via the Azure Portal and then parse the resulting ARM template. The ARM template for Cosmos DB is refreshingly simple too.

Using an ARM template as a reference, and combining that with VS Code’s IntelliSense, writing Bicep feels intuitive and satisfying. I actually enjoy it.
My first attempt resulted in the error below. I caused it by placing the location property in the wrong spot...
[Image: error]
This is what my Bicep looked like in the end:

param dbname string
param location string = resourceGroup().location
param primaryRegion string
resource cosmodb 'Microsoft.DocumentDB/databaseAccounts@2024-02-15-preview' = {
  name: dbname
  location: location
  properties: {
    databaseAccountOfferType: 'Standard'
    locations: [
      {
        failoverPriority: 0
        locationName: primaryRegion
        isZoneRedundant: false
      }
    ]
    backupPolicy: {
      type: 'Continuous'
      continuousModeProperties: {
        tier: 'Continuous7Days'
      }
    }
    isVirtualNetworkFilterEnabled: false
    minimalTlsVersion: 'Tls12'
    enableMultipleWriteLocations: false
    enableFreeTier: true
    capacity: {
      totalThroughputLimit: 1000
    }
  }
}


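The most important part of this code is totalThroughputLimit. It caps the total throughput that can be provisioned on the account at 1000 RU/s, which matches the free tier allowance and so keeps me in the free tier.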
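
To make the effect of that cap concrete, here is a minimal Python sketch of the arithmetic. It is only a model of the idea, not how Azure implements the check: the function name `can_provision` and the example throughput figures are my own assumptions, and the 1000 RU/s constant mirrors the value in the Bicep above.

```python
FREE_TIER_RU = 1000  # RU/s cap, mirroring totalThroughputLimit in the Bicep above


def can_provision(existing_ru: list[int], requested_ru: int,
                  limit: int = FREE_TIER_RU) -> bool:
    """Return True if adding requested_ru keeps the account total within limit.

    Hypothetical helper for illustration only. A limit of -1 is treated
    as "no limit".
    """
    if limit == -1:
        return True
    return sum(existing_ru) + requested_ru <= limit


if __name__ == "__main__":
    # One container at 400 RU/s leaves room for another 400 RU/s...
    print(can_provision([400], 400))
    # ...but a third 400 RU/s container would push the total past the cap.
    print(can_provision([400, 400], 400))
```

With the cap in place, a request that would take the account past the limit is refused instead of quietly producing a bill.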

Inside my new database, I created a new container called ‘VisitorCounter’ and created a list for my visitors. This is where I learned what a partition key was. A partition key is the property Cosmos DB uses to group items into logical partitions; an item is uniquely identified by its id together with its partition key value. As I wanted visit counts to be unique to each IP address, I set the IP address as my partition key.
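
A rough way to picture a container partitioned on the visitor's IP is a dictionary of dictionaries. The sketch below is an in-memory stand-in written for illustration, not the Cosmos DB SDK: the class `VisitorContainer`, the item id `visits` and the field names are all assumptions of mine.

```python
from collections import defaultdict
from typing import Optional


class VisitorContainer:
    """In-memory stand-in for a container partitioned on the visitor IP."""

    def __init__(self) -> None:
        # partition key value -> {item id -> item}
        self._partitions = defaultdict(dict)

    def record_visit(self, ip: str) -> dict:
        """Create or update the single counter item in this IP's partition."""
        partition = self._partitions[ip]
        item = partition.setdefault("visits", {"id": "visits", "ip": ip, "count": 0})
        item["count"] += 1
        return item

    def read_item(self, item_id: str, partition_key: str) -> Optional[dict]:
        """Look an item up by id AND partition key, as a point read would."""
        return self._partitions.get(partition_key, {}).get(item_id)


if __name__ == "__main__":
    container = VisitorContainer()
    container.record_visit("203.0.113.7")
    container.record_visit("203.0.113.7")
    container.record_visit("198.51.100.2")
    print(container.read_item("visits", "203.0.113.7"))
```

Every item here has the same id, yet nothing collides, because each one lives in a different partition. That is the sense in which the id and the partition key identify an item together.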

As the plan was to record Public IPs, I did some research into the legalities and ethics surrounding this and came to the conclusions:

  1. A guy on Stack Overflow said it’s fine as long as I don’t do anything malicious with them.
  2. I’m going to ignore Copilot’s warnings about privacy and GDPR.
  3. I can collect them using an Azure Function.

Writing the Function App Bicep was not refreshingly simple. It was leagues more difficult than the Cosmos DB Bicep. When deploying resources via the Portal, you lose appreciation for all the supporting child resources that get spun up alongside the parent resource. All those supporting resources need to be configured individually and then mapped to the parent resource. Each comes with its own unique properties and requirements, so a simple portal deployment can turn into a complex Bicep one.

As mentioned earlier, I’ll usually create a resource in the portal and then go through the ARM template but in this case the resulting template felt like a bit of a maze, so I decided to build bottom up rather than reverse engineer top down.
[Image: first draft]
This was my first draft and it actually worked... except it deployed an App Service plan, not a Function App. Fast forward through a few hours of trial and error with Microsoft Learn, and I hadn't got much further than this. Eventually I gave up and googled it with the specific intention of avoiding Microsoft Learn.
This landed me on this blog and about an hour later, we had success!

It did leave me with some mild frustration, because I am simply perplexed as to how this person figured it out. I attempted to reconcile what they discussed with the Microsoft documentation and I was left scratching my head. While Copilot did a lot of heavy lifting to help me understand their code, this would’ve been a moment where I would refer to my colleagues for a sanity check or Microsoft Support for reassurance.

Something I did learn about, though, is 'Action' operations in Bicep. In the example below, I used listKeys to get the keys of a storage account I had just deployed.

param location string = resourceGroup().location
param name string = 'beeresumequery'

resource storageaccount 'Microsoft.Storage/storageAccounts@2023-04-01' = {
  name: '${name}storage'
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
}
var StorageAccountPrimaryAccessKey = listKeys(storageaccount.id, storageaccount.apiVersion).keys[0].value

resource appinsights 'Microsoft.Insights/components@2020-02-02' ={
  name: '${name}appinsights'
  location: location
  kind: 'web'
  properties:{
    Application_Type: 'web'
    publicNetworkAccessForIngestion:'Enabled'
    publicNetworkAccessForQuery:'Enabled'
  }
}
var AppInsightsPrimaryAccessKey = appinsights.properties.InstrumentationKey

resource hostingplan 'Microsoft.Web/serverfarms@2023-12-01' = {
  name: '${name}hp'
  location: location
  kind: 'linux'
  properties: {
    reserved:true
  }
  sku:{
    name: 'Y1' //Consumption plan
  }
}

resource ResumeFunctionApp 'Microsoft.Web/sites@2023-12-01' = {
  name: '${name}functionapp'
  location: location
  kind: 'functionapp'
  identity:{
    type:'SystemAssigned'
  }
  properties:{
    httpsOnly:true
    serverFarmId:hostingplan.id
    siteConfig:{
//      use32BitWorkerProcess:true //this allows me to use the FREEEEE tier
      alwaysOn:false
      linuxFxVersion: 'python|3.11'
      cors:{
        allowedOrigins: [
          'https://portal.azure.com'
        ]
      }
      appSettings:[
        {
          name: 'APPINSIGHTS_INSTRUMENTATIONKEY'
          value: AppInsightsPrimaryAccessKey
        }
        {
          name: 'APPLICATIONINSIGHTS_CONNECTION_STRING'
          value: 'InstrumentationKey=${AppInsightsPrimaryAccessKey}'
        }
        {
          name: 'AzureWebJobsStorage'
          value: 'DefaultEndpointsProtocol=https;AccountName=${storageaccount.name};EndpointSuffix=${environment().suffixes.storage};AccountKey=${StorageAccountPrimaryAccessKey}'
        }
        {
          name: 'FUNCTIONS_EXTENSION_VERSION'
          value: '~4'
        }
        {
          name: 'FUNCTIONS_WORKER_RUNTIME'
          value: 'python'
        }
        {
          name: 'WEBSITE_CONTENTSHARE'
          value: toLower(storageaccount.name)
        }
        {
          name: 'WEBSITE_CONTENTAZUREFILECONNECTIONSTRING'
          value: 'DefaultEndpointsProtocol=https;AccountName=${storageaccount.name};EndpointSuffix=${environment().suffixes.storage};AccountKey=${StorageAccountPrimaryAccessKey}'
        }
      ]
    }
  }
}
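
The AzureWebJobsStorage value in the Bicep above is just a string stitched together from the account name, the endpoint suffix and the key that listKeys returns. As a rough illustration, the same assembly in Python looks like this. The function name and the sample values are hypothetical; the default suffix is what environment().suffixes.storage resolves to in the public Azure cloud.

```python
def storage_connection_string(account_name: str, account_key: str,
                              endpoint_suffix: str = "core.windows.net") -> str:
    """Assemble a storage connection string in the same order as the Bicep."""
    parts = {
        "DefaultEndpointsProtocol": "https",
        "AccountName": account_name,
        "EndpointSuffix": endpoint_suffix,
        "AccountKey": account_key,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


if __name__ == "__main__":
    # Placeholder values only; a real key comes from listKeys at deploy time.
    print(storage_connection_string("beeresumequerystorage", "abc123=="))
```

Seeing it as plain string concatenation made it clearer to me why the key has to be fetched with listKeys first: without it the string cannot be built.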

With the Function App deployed, I next had to figure out how to create the interaction between this app and my website.

I created an HTTP trigger using the built-in templates and, given I had limited experience with Python, I thought it best to take my time to understand what everything was before continuing. Taking the time now will pay dividends later when I want to amend or troubleshoot things.

import azure.functions as azfunc
#This imported the Azure Functions SDK. I've always visualised SDKs as a sort of Ikea flatpack box, except for programmers. I didn't think using an SDK would be this simple though. 
#The original template imports this as 'func' but I've changed it to 'azfunc' just to make it more clear that it's the SDK and not a Python shorthand. 
import logging
#Straightforward: this is Python's standard logging module.
app = azfunc.FunctionApp(http_auth_level=azfunc.AuthLevel.FUNCTION)
#This creates an instance of the 'FunctionApp' class within the code. FunctionApp is basically a blueprint from the SDK for creating a "function app object". 
#The section in brackets () defines what level of authentication is needed. ANONYMOUS is no auth, FUNCTION requires the function key and ADMIN requires the master key. 
#What is a class? A class is the blueprint, it defines how an object is created. Providing structure and methods for performing a specific task.  
#What is an object? It is something that is built based on a blueprint. The objects below are HttpRequest and HttpResponse.
#By creating this instance, I don't need to define what those two objects actually are. Which is good because I wouldn't know how. 
@app.route(route="http_trigger1")
#This uses app.route as a decorator to define a route for the function app. So if an HTTP request is made to my function app URL followed by /api/http_trigger1, the below function will activate.
#What is a route? A route is a pathway that can be taken within an application. The route is functionappurl.com/api/http_trigger1
#What is a decorator? Decorators are sort of layered functions. Do this but also do that with it. E.g. You can have 'Hello World!' and create a decorator for it that converts all letters to uppercase to produce 'HELLO WORLD!'.
def http_trigger1(req: azfunc.HttpRequest) -> azfunc.HttpResponse: 
#This defines the http_trigger1 function. It notes that it takes an HttpRequest object as its argument. 
#"-> azfunc.HttpResponse" is something that is referred to as 'type hinting'. It advises that the expected return value here is an HttpResponse.
#What is Type Hinting? Type Hinting is something you add to your code to improve readability and to know what the intention of the code is. 
#The difference between commenting and Type Hinting is that Type Hinting can be used by some tools for error checking and debugging. They're kind of like comments but for your tools. 
#Imagine an interesting future where the Natural Language from comments could be used for Type Hinting.
#I expressed the above idea to Bing and then it showed me an example of a Natural Language comment being interpreted as a type hint. 
#Bing is just showing off now. 
    logging.info('Python HTTP trigger function processed a request.')
#Straightforward, performs a logging action. The .info is the log level: this is just information, not a warning or an error message. 
    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
#This script is trying to get the 'name' value from the request's query parameters. If 'name' is not provided in the parameters, it then tries to get 'name' from the JSON body of the request.
#If 'name' is not in the JSON body or if the body is not valid JSON, name will be None.
#So basically when an HTTP request is made to the function app URL, it should include a query parameter that defines a name. E.g. "name=Brandon". If there is no name then it'll check if there is one in the JSON body. If it isn't found there either, name stays None and the generic response below is returned. 
    if name:
        return azfunc.HttpResponse(f"Hello, {name}. This HTTP triggered function executed successfully.")
    else:
        return azfunc.HttpResponse(
             "This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response.",
             status_code=200
        )
#The above is straightforward. The previous block was looking for a name because it wants to pass that name into this block. So it takes that parameter and places it into the {name} field.
#If there is no name then it tells you to include a name in the query string. 


#Running this code 
#HTTP method: GET / POST (if using a JSON body)
#Key = The URL of my functionapp
#Query parameters: 'name=brandon' or no name
#Headers: None / Content-Type: application/json (if using a JSON body)
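
The decorator idea from the comments above can be tried locally without Azure. This is a minimal sketch under my own assumptions: `shout`, `greet` and `MiniApp` are hypothetical names, and `MiniApp` only imitates the route-registering part of what the real FunctionApp class does.

```python
import functools
from typing import Callable


def shout(func: Callable[..., str]) -> Callable[..., str]:
    """Decorator: call func, then upper-case whatever it returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> str:
        return func(*args, **kwargs).upper()
    return wrapper


@shout
def greet() -> str:
    return "Hello World!"


class MiniApp:
    """Hypothetical stand-in for FunctionApp that only remembers routes."""

    def __init__(self) -> None:
        self.routes = {}

    def route(self, route: str):
        def register(func):
            self.routes[route] = func
            return func  # the function itself is left unchanged
        return register

    def handle(self, route: str, params: dict) -> str:
        return self.routes[route](params)


app = MiniApp()


@app.route(route="http_trigger1")
def http_trigger1(params: dict) -> str:
    name = params.get("name")
    return f"Hello, {name}." if name else "Pass a name in the query string."


if __name__ == "__main__":
    print(greet())
    print(app.handle("http_trigger1", {"name": "Brandon"}))
```

The two decorators do different jobs: `shout` changes what the function returns, while `route` leaves the function alone and just records where to find it. The second is closer to how I understand app.route to behave.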

Copilot then created some simple JavaScript to test it.

<!DOCTYPE html>
<html>
<head>
    <title>Fetch Example</title>
</head>
<body>
    <button id="fetchButton">Fetch Data</button>
    <div id="data">Press the button to fetch data...</div>
    <script>
        document.getElementById('fetchButton').addEventListener('click', fetchData);
        async function fetchData() {
            document.getElementById('data').innerText = "Loading...";
            try {
                const response = await fetch('https://functionapp.azurewebsites.net/api/http_trigger1?code=1234');
                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }
                const data = await response.text();
                document.getElementById('data').innerText = data;
            } catch (error) {
                console.error('Error:', error);
                document.getElementById('data').innerText = error.message;
            }
        }
    </script>
</body>
</html>

This test code simply pings my Function App and returns whatever message it gets back. In this case I received an error which ended up being CORS related. CORS, or Cross-Origin Resource Sharing, determines which origins a browser will allow to call your Function App from a web page. It also accepts a wildcard to allow all origins.
A successful query looked like this
[Image: success]
Unfortunately, the keys to my Function App were written in the HTML itself and keeping them there was a non-starter. Even though CORS restricts which sites can call my API from a browser, intuition tells me circumventing it would be trivial, since CORS is enforced by the browser and a client such as curl simply ignores it.
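
To show why CORS alone doesn't protect a key, here is a simplified Python model of the server side of the check. It is a sketch of the general mechanism, not Azure's implementation: the function name is hypothetical and the https://resume.allmark.me origin is assumed from my site's address.

```python
from typing import Optional


def cors_headers(origin: Optional[str], allowed_origins: list[str]) -> dict[str, str]:
    """Return the CORS headers a server would attach for this request.

    The server only decides which headers to send. It is the browser
    that refuses to hand the response to a page from another origin
    when the header is missing.
    """
    if "*" in allowed_origins:
        return {"Access-Control-Allow-Origin": "*"}
    if origin is not None and origin in allowed_origins:
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}


if __name__ == "__main__":
    allowed = ["https://resume.allmark.me"]
    print(cors_headers("https://resume.allmark.me", allowed))
    print(cors_headers("https://evil.example", allowed))
    # A script or curl sends no Origin header and never looks at the result.
    print(cors_headers(None, allowed))
```

In this model a non-browser client gets no CORS headers back and carries on regardless, so anyone who copies the key out of the page source can call the API directly.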
After doing some reading, there are a few things I need to do to get this right:

  1. CORS - ✔️
  2. HTTPS - ✔️
  3. Logging and Monitoring - ✔️
  4. Network Isolation - Needs investigation.
  5. Azure API Management - Needs investigation.
  6. Store Keys in a KeyVault - ❌

Azure API Management seemed interesting and it didn’t support Network Isolation so I started with it first.

In the Cloud Resume Challenge briefing material, it was mentioned that many people struggle to go beyond the initial stages of deploying the website. I began to understand why while writing the Bicep for the APIM resource. There are so many new concepts that need to be learned and deploying resources with Bicep adds another layer of difficulty on top.

Writing the Bicep took a lot longer than expected but I learned a lot on the way. Cool Bicep tricks like parameter objects, parameter arrays, create if not found, modules & outputs to name a few.

After spinning up the APIM and spending time clicking around on the resource, I figured I'd think about what my secure workflow would look like:

  1. My website queries my Key Vault for the keys to my API
  2. This works because my website has a managed identity with read access to those keys
  3. My website then queries my API Service
  4. My API Service queries my Function App
  5. My Function App queries my Cosmos DB
  6. It then flows backwards into a result to my website

This doesn’t make sense. My goal is simply to have my Azure Storage Static Webapp securely interact with my database. Directly referencing your KeyVault from the front end is apparently bad practice and so is putting your function and API keys in your code. Copilot suggested I spin up a FunctionApp to talk to my vault so I can talk to my APIM resource that talks to my Function App.
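
The common thread in that advice is that the key should live in server-side configuration and never reach the browser. Below is a minimal sketch of that idea, assuming a hypothetical app setting called FUNCTION_KEY and the placeholder URL from my test page. It builds the request but does not send it, and it passes the key in the x-functions-key header instead of the ?code= query string.

```python
import os
import urllib.request
from typing import Mapping


def build_function_request(url: str,
                           env: Mapping[str, str] = os.environ) -> urllib.request.Request:
    """Build a request to the Function App with the key read from configuration.

    FUNCTION_KEY is a hypothetical setting name chosen for this sketch.
    """
    key = env.get("FUNCTION_KEY")
    if not key:
        raise RuntimeError("FUNCTION_KEY is not configured")
    return urllib.request.Request(url, headers={"x-functions-key": key})


if __name__ == "__main__":
    request = build_function_request(
        "https://functionapp.azurewebsites.net/api/http_trigger1",
        env={"FUNCTION_KEY": "1234"},  # placeholder value for the demo
    )
    print(request.full_url)
```

This only helps when the code runs somewhere the visitor cannot read it, which is exactly what a static page served from a Storage Account cannot offer.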

The next few hours researching the best way to approach this can be summarised in the following:

  1. "You can securely access secrets by doing this…."
  2. "Actually, this isn't secure because people can still do this..."
  3. "Just use Azure Static Web Apps"

I knew securing it in some form was possible, as others who have completed this challenge have done it. But I set a rule that I wouldn't copy others and would instead research my own solutions. In the end, the conclusion I came to was that spending hours trying to make something work when it wasn't the best way to do it was insanity.

I decided to pivot and begin the process of migrating my website from Storage Accounts to an actual static webapp. Setting up the resource was uneventful. I did it via the Portal as I just wanted to move forward but I've made a note to write the Bicep for it when I'm not as exasperated.

I also noticed Azure Static Web Apps has a preview feature that allows a direct connection to an Azure database with built-in security, the same role-based security used to secure API endpoints. This means I won't need the Function App or API Management, which significantly reduces the complexity of my design. I’ll find another excuse to use Python in this project as I do want to learn more about that language.

During the creation of my Static Web App, I took the opportunity to set up a pipeline between my GitHub repo and my Static Web App. This means committing to my repo will trigger a GitHub Actions workflow that deploys the code to my website. Very satisfying!

My next blog post will go into how I did this.

Reflecting on the above, this one was an absolute slog. It’s hard to express it here but I spent about 80% of the time reading documentation and troubleshooting.
