Notes on Packaging Python Lambdas

Some notes I made on building a python lambda package, because I spent far too much time on it this week. In all these examples I’m using a requirements.txt to install dependencies into a lambdaPackage directory.

Install packages that are compatible with the lambda environment

We need to make sure pip downloads packages that will be compatible with the lambda runtime environment. AWS suggest using the --platform and --only-binary flags when installing dependencies:

$ pip install --platform manylinux2014_x86_64 --only-binary=:all: --requirement requirements.txt --target lambdaPackage
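To make it a bit clearer what those flags are asking for: --only-binary=:all: means pip will only accept wheels, and --platform restricts it to wheels whose filename carries a matching platform tag (or the pure Python "any" tag). Here is a simplified sketch of that check. The filenames in the usage note are made up for illustration, and pip's real matching also looks at the Python and ABI tags, which this ignores.

```python
from __future__ import annotations


def platform_tags(filename: str) -> set[str]:
    """Return the platform tags encoded in a wheel filename (empty for sdists)."""
    if not filename.endswith(".whl"):
        return set()  # sdists (.tar.gz, .zip) carry no platform tag
    # Wheel names end in -{python tag}-{abi tag}-{platform tag}.whl, and the
    # platform tag may hold several dot-separated values.
    stem = filename[: -len(".whl")]
    return set(stem.split("-")[-1].split("."))


def acceptable_for(filename: str, platform: str = "manylinux2014_x86_64") -> bool:
    """Simplified check: is this a pure Python wheel, or one built for `platform`?"""
    tags = platform_tags(filename)
    return "any" in tags or platform in tags
```

With this, a hypothetical example_pkg-1.0-py3-none-any.whl passes, while example_pkg-1.0.tar.gz does not, which is the situation you hit when a dependency only publishes a source distribution.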

The problem I found with this approach is that it fails if one of the dependencies doesn’t have a release that matches those parameters (compare orjson with pyleri for example). As an alternative, we can run pip inside the AWS SAM build container images, which should ensure we get a compatible set of packages:

$ docker run -it --rm --entrypoint='' -v "$(pwd):/workspace" -w /workspace <sam-build-image> pip install -r requirements.txt -t lambdaPackage

Don’t accidentally include a top level folder inside the zip file.

If we create the archive with zip lambdaPackage/* we end up with a zip file that includes the lambdaPackage directory. This was actually quite annoying to troubleshoot, because when you extract the zip file in macOS it will create this enclosing folder for you if it doesn’t exist (instead of spewing files all over the current folder), so it took me a while to realise what was happening. To add only the contents of lambdaPackage to the archive, we want to run the zip command from inside the lambdaPackage directory:

$ pushd lambdaPackage
$ zip -r ../<archive-name>.zip .
$ popd

You can also use the zipinfo command to examine the contents of the archive.
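If you would rather not depend on which directory the zip command runs from, the same thing can be done with Python's zipfile module. This is only a sketch: the function names are my own, and it stores each file under its path relative to the package directory so that nothing ends up inside an enclosing folder.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def zip_directory_contents(src_dir: str, archive_path: str) -> None:
    """Zip the contents of src_dir so they sit at the root of the archive."""
    src = Path(src_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir, not including src_dir itself.
                archive.write(path, path.relative_to(src).as_posix())


def top_level_entries(archive_path: str) -> set[str]:
    """Return the distinct first path components found in the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return {name.split("/", 1)[0] for name in archive.namelist()}
```

If top_level_entries() comes back with a single entry named lambdaPackage, the archive has the enclosing folder problem described above. Write the archive somewhere outside src_dir, otherwise it will try to include itself.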

Check File Permissions

Lambda needs to have permission to read your files and dependencies. The expected permissions for python projects are 644 for files and 755 for directories. You can update everything with:

$ find lambdaPackage -type f -exec chmod 644 {} +
$ find lambdaPackage -type d -exec chmod 755 {} +
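If the packaging step is already a Python script, the same permissions can be applied with os.walk. This is a minimal sketch (the function name is mine) that uses the 644 and 755 modes mentioned above as defaults.

```python
import os


def normalise_permissions(root: str, file_mode: int = 0o644, dir_mode: int = 0o755) -> None:
    """Apply file_mode to every file and dir_mode to every directory under root."""
    os.chmod(root, dir_mode)
    # os.walk is top-down by default, so each directory is made readable
    # before the walk tries to descend into it.
    for current_dir, dir_names, file_names in os.walk(root):
        for name in dir_names:
            os.chmod(os.path.join(current_dir, name), dir_mode)
        for name in file_names:
            os.chmod(os.path.join(current_dir, name), file_mode)
```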

Slim down the dependencies

The maximum size of a lambda package is 250MB (unzipped), including all lambda layers. There are a few things we can do to slim down the resulting package that shouldn’t impact the lambda environment.
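To see how much of the allowance a package is using, something like the following can be run against the lambdaPackage directory. It is a rough sketch: it treats 250MB as 250 × 1024 × 1024 bytes, which is my assumption about how the limit is counted, and it only measures one directory, so any layers would need to be added separately.

```python
import os

# Assumption: the 250MB unzipped limit is counted as 250 * 1024 * 1024 bytes.
LAMBDA_UNZIPPED_LIMIT = 250 * 1024 * 1024


def directory_size(root: str) -> int:
    """Sum the sizes of all regular files under root, in bytes."""
    total = 0
    for current_dir, _dir_names, file_names in os.walk(root):
        for name in file_names:
            path = os.path.join(current_dir, name)
            if not os.path.islink(path):  # skip symlinks rather than follow them
                total += os.path.getsize(path)
    return total


def fraction_of_limit(root: str, limit: int = LAMBDA_UNZIPPED_LIMIT) -> float:
    """Return the share of the unzipped size limit used by root (1.0 = full)."""
    return directory_size(root) / limit
```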

Running pip with PYTHONPYCACHEPREFIX=/dev/null will stop it from writing any __pycache__ files into the package (possibly at the expense of slower cold start times). Since we won’t be running pip again it’s also usually safe to delete all the .dist-info directories.
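If the package has already been built, the same cleanup can be done after the fact. This is a sketch with a function name of my own choosing. One caveat: a dependency that reads its own metadata at runtime through importlib.metadata will break if its .dist-info directory is gone, so check before pruning.

```python
import shutil
from pathlib import Path


def prune_metadata(root: str, patterns=("__pycache__", "*.dist-info")) -> list:
    """Delete directories under root matching patterns; return what was removed."""
    removed = []
    for pattern in patterns:
        # sorted() materialises the matches first, so we are not deleting
        # while the directory tree is still being walked.
        for path in sorted(Path(root).rglob(pattern)):
            if path.is_dir():  # False if a parent was already removed
                shutil.rmtree(path)
                removed.append(str(path.relative_to(root)))
    return removed
```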

Some of the biggest wins can be found with modules like googleapiclient (75MB - about 30% of our allowance!) which include large model files describing each service they support. In this case it should be safe to delete the model files for any services we won’t be using. You’ll find them in googleapiclient/discovery_cache/documents/. Botocore had a similar issue, but since 1.32.1 it has stored these model files compressed.
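A sketch of how that pruning might look. It assumes the discovery documents are named <service>.<version>.json (for example drive.v3.json), so check the directory listing for your installed version before relying on it. The function name and the keep_services parameter are mine.

```python
from pathlib import Path


def prune_discovery_documents(documents_dir: str, keep_services: set) -> list:
    """Delete discovery documents whose service name is not in keep_services."""
    removed = []
    for doc in sorted(Path(documents_dir).glob("*.json")):
        # Assumption: files are named <service>.<version>.json
        service = doc.name.split(".", 1)[0]
        if service not in keep_services:
            doc.unlink()
            removed.append(doc.name)
    return removed
```

For example, prune_discovery_documents("lambdaPackage/googleapiclient/discovery_cache/documents", {"drive"}) would keep only the Drive documents.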

Precompiling modules

One thing I didn’t try was precompiling the dependencies. AWS actually advises against this but as long as it’s run inside the appropriate AWS SAM build container images the result should be compatible with the lambda runtime, and could speed up cold start times.
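For reference, precompiling could look something like this with the standard library's compileall module. As I said, I haven't tried this on a real Lambda, so treat it as a sketch. It has to run under the same Python minor version as the Lambda runtime, because .pyc files are version specific, and the choice of the UNCHECKED_HASH invalidation mode (so the bytecode isn't discarded if file timestamps change during packaging) is my assumption about what would work best.

```python
import compileall
import py_compile


def precompile(root: str) -> bool:
    """Byte-compile every .py file under root; return True if all succeeded."""
    return bool(
        compileall.compile_dir(
            root,
            quiet=1,  # only print errors
            force=True,  # recompile even if a .pyc already exists
            # Assumption: trust the .pyc without checking it against the source.
            invalidation_mode=py_compile.PycInvalidationMode.UNCHECKED_HASH,
        )
    )
```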

Tunnelling SSH over AWS SSM Session Manager

In our AWS environment, no hosts are exposed to the internet in public subnets. Instead, we use a feature of AWS Systems Manager if we need to connect to instances. AWS Systems Manager Session Manager works by running an agent on each instance which opens a connection back to the Systems Manager service. To connect to an instance, someone with the appropriate IAM permissions can use the aws ssm start-session command, or the connect button in the AWS console. This connects to Systems Manager over HTTPS, which sends a message to the agent on the instance to set up a bi-directional tunnel between the client and the instance. Commands can then be sent over this tunnel to the agent, and output is sent back over the tunnel to the client.

[Image: SSM Session Manager]

This is usually all we need to connect to instances for troubleshooting etc., but occasionally we may need to transfer a file. To do that we need to use another feature of Session Manager, which allows us to forward ports from the remote instance back to our client. Conceptually this is much the same as when we use SSH to forward a remote port with ssh -L 8080:localhost:80: port 80 on the remote instance can be accessed on our local machine on port 8080. In the case of Session Manager, the connection is tunnelled over HTTPS instead of SSH, but the result is the same.

To upload a file to the instance we want to use the scp command.

First we update our .ssh/config file to tell ssh and scp to use a proxy command when connecting to instances.

Host i-*
  IdentityFile ~/awskeypair.pem
  User ec2-user
  ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"

Since we will be connecting to the sshd service on the instance (instead of the SSM Agent), we need to have valid credentials to log in - in this case the private key used to launch the instance. You could also leave out the IdentityFile and User settings and instead pass in the appropriate authentication options on the command line when you connect.
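To make the ProxyCommand a little less magic: ssh substitutes %h with the host name we typed (the instance ID) and %p with the port (22 unless told otherwise), then runs the resulting command and speaks SSH over its stdin and stdout. The sketch below just builds that expanded command so you can see what ends up being run. The function name is my own.

```python
import shlex


def proxy_command(host: str, port: int = 22) -> list:
    """Build the argv the ProxyCommand runs once ssh substitutes %h and %p."""
    return [
        "aws", "ssm", "start-session",
        "--target", host,  # %h: the instance ID given to ssh
        "--document-name", "AWS-StartSSHSession",
        "--parameters", f"portNumber={port}",  # %p: the SSH port
    ]


def as_shell(host: str, port: int = 22) -> str:
    """Render the expanded command as a single shell-quoted string."""
    return shlex.join(proxy_command(host, port))
```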

We should then be able to connect to the instance over ssh by using:

ssh i-0123456abcdef

Or transfer files with:

scp somefile.txt i-0123456abcdef:/tmp/somefile.txt

It’s worth noting that when we do this, the SSM Agent is no longer able to save session logs to CloudWatch or S3 for us, since it doesn’t have access to the encrypted SSH traffic.

Adding a Capacity Provider to an ECS Cluster with Terraform

I recently had to use terraform to add a capacity provider to an existing ECS cluster.

After adding a default capacity provider to the cluster, existing services still have a launch_type=EC2, so we need to update them to use a capacity_provider_strategy in order to use it. Unfortunately we can’t do this in terraform due to a long-standing bug:

When an ECS cluster has a default_capacity_provider_strategy setting defined, Terraform will mark all services that don’t have ignore_changes=[capacity_provider_strategy] to be recreated.

The ECS service actually does support changing from launch_type to capacity_provider_strategy non-destructively, by forcing a redeploy. Since this uses the service’s configured deployment mechanism there’s no disruption.

[Image: ECS Compute Configuration]

We can also set this using the CLI:

aws ecs update-service --cluster my-cluster --service my-service --capacity-provider-strategy capacityProvider=ec2,weight=100,base=0 --force-new-deployment

If for some reason we need to revert, ECS also supports changing back from capacity_provider_strategy to launch_type, however the option is disabled in the console:

[Image: ECS Compute Configuration]

As a workaround, we can pass an empty list of capacity providers to the update-service command, which will result in the service using launch_type=EC2 again.

aws ecs update-service --cluster my-cluster --service my-service --capacity-provider-strategy '[]' --force-new-deployment
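If you have a lot of services to move, the same two calls can be scripted with boto3. This is a sketch rather than something I have run against every service: the function names are mine, the client is passed in so nothing here depends on credentials, and I'm assuming boto3 treats an empty capacityProviderStrategy list the same way the CLI treats '[]'.

```python
def use_capacity_provider(ecs_client, cluster: str, service: str, provider: str):
    """Switch a service from launch_type to a single capacity provider."""
    return ecs_client.update_service(
        cluster=cluster,
        service=service,
        capacityProviderStrategy=[
            {"capacityProvider": provider, "weight": 100, "base": 0}
        ],
        forceNewDeployment=True,  # required for the change to take effect
    )


def revert_to_launch_type(ecs_client, cluster: str, service: str):
    """Pass an empty strategy so the service falls back to its launch type."""
    return ecs_client.update_service(
        cluster=cluster,
        service=service,
        capacityProviderStrategy=[],  # assumption: mirrors '[]' on the CLI
        forceNewDeployment=True,
    )
```

Usage would be along the lines of use_capacity_provider(boto3.client("ecs"), "my-cluster", "my-service", "ec2").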

Home Assistant Apple Watch Complications

I spent a little time recently working out how to get Home Assistant complications working on my watch. Once you work out the inputs they want they seem to be really reliable, and they also function as a quick way to open the watch app to trigger actions and scenes.

[Image: Apple Watch with Home Assistant Complication]

It’s important to pick the right complication for the watchface you are using. I used the “Graphic Circular” complication type, with the “Open Gauge Image” template. The Apple Developer Documentation has examples of all the different complication types and which watch faces support them.

Instead of trying to type in the Jinja2 templates on my phone, I found it was much easier to use the developer tools template editor inside Home Assistant. Once it’s working you can just copy and paste it into the mobile app.

We need to use template syntax to generate two values. The first is the sensor value. There is only just enough space for 3 digits, so I’ve rounded the value to 1 decimal place:

{{ states("sensor.office_temperature") | float | round(1) }}

The other value we need is a number between 0.0 and 1.0, representing the fraction of the gauge that should be filled. Since I don’t know what the temperature range is going to be, I’m using max() and min() to ensure we always get a value between 0.0 and 1.0, and that the gauge fills as we approach the target temperature.

{% set current = states("sensor.office_temperature") | float %}
{% set low = min(18.0, current) %}
{% set high = max(22.0, current) %}
{{ (current - low) / (high - low) }}
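The arithmetic is easier to check outside of a template, so here is the same calculation as a small Python function (the name is mine). If the reading drops below 18.0 the lower bound follows it down and the gauge reads empty. If it rises above 22.0 the upper bound follows it up and the gauge reads full.

```python
def gauge_fraction(current: float, low: float = 18.0, high: float = 22.0) -> float:
    """Mirror the template: stretch the range to include current, then scale."""
    low = min(low, current)  # a reading below the range becomes the new minimum
    high = max(high, current)  # a reading above the range becomes the new maximum
    return (current - low) / (high - low)
```

For example, gauge_fraction(20.0) is 0.5, gauge_fraction(17.0) is 0.0 and gauge_fraction(25.0) is 1.0.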

We also need to specify an icon - I used chair_rolling to match my office dashboard. It’s a bit small but you can more or less make it out.

Syncing an On Call Calendar to Home Assistant

I thought it might be useful if Home Assistant knew when I was on call. I could use this to make sure the office doesn’t get too cold overnight, or to send me a notification if I leave home without my laptop.

[Image: Home Assistant Notification]

We use PagerDuty, which gives you an iCal calendar feed, so I assumed I could just use this. Unfortunately while Home Assistant has integrations for Local Calendars and CalDAV, neither of these supports just fetching a single .ics file over HTTP.

After a bit of digging around I discovered that Home Assistant stores local calendars in the .storage folder alongside its config files, so I figured I could just overwrite this file manually using a shell_command. You need to create the calendar first (under Settings > Devices & Services > Add Integration > Local Calendar). Once it’s created, add an event to get Home Assistant to create the local calendar file.

The shell_command goes into configuration.yaml:

shell_command:
  update_on_call_calendar: 'curl <ical-feed-url> > /config/.storage/local_calendar.on_call.ics'
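One thing to be aware of with a plain shell redirect is that the destination file is truncated before curl has fetched anything, so a failed download leaves an empty calendar. I haven't needed it, but if that becomes a problem a small script could check the response and swap the file in atomically. This is a sketch with names of my own choosing. It only checks that the text starts with BEGIN:VCALENDAR, which is a sanity check, not a full validation.

```python
import os
import tempfile


def replace_calendar(ics_text: str, destination: str) -> bool:
    """Replace destination with ics_text only if it looks like an iCalendar file."""
    if not ics_text.lstrip().startswith("BEGIN:VCALENDAR"):
        return False  # probably an error page; keep the existing calendar
    directory = os.path.dirname(os.path.abspath(destination))
    # Write to a temporary file in the same directory, then rename it into
    # place so readers never see a partially written calendar.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(ics_text)
        os.replace(tmp_path, destination)
    except OSError:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return True
```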

We can then use the shell command in an automation, followed by homeassistant.reload_config_entry to get Home Assistant to reload the file from disk. I have this running on an hourly time_pattern trigger, but you could increase the update frequency for a calendar that changes more regularly.

alias: Refresh On Call Calendar
description: ""
trigger:
  - platform: time_pattern
    minutes: "0"
condition: []
action:
  - service: shell_command.update_on_call_calendar
    data: {}
  - service: homeassistant.reload_config_entry
    target:
      entity_id: calendar.on_call
    data: {}
mode: single

Once the calendar has updated you should see events show up in Home Assistant. The calendar state can be used in automations:

alias: On Call Laptop Check
description: "Send a push notification if I leave my laptop at home when I'm on call"
trigger:
  - platform: state
    entity_id:
      - person.tom_henderson
    to: not_home
condition:
  - condition: and
    conditions:
      - condition: state
        entity_id: calendar.on_call
        state: "on"
      - condition: device
        device_id: <device_id>
        domain: device_tracker
        entity_id: device_tracker.toms_m2
        type: is_home
action:
  - device_id: <device_id>
    domain: mobile_app
    type: notify
    message: You're on-call. Did you leave your laptop at home?
mode: single