25 Sep 2024
Some notes I made on building a python lambda package, because I spent far too much time on it this week. In all these examples I’m using a requirements.txt to install dependencies into a lambdaPackage directory.
Install packages that are compatible with the lambda environment
We need to make sure pip downloads packages that will be compatible with the lambda runtime environment. AWS suggests using the --platform and --only-binary flags when installing dependencies:
$ pip install --platform manylinux2014_x86_64 --only-binary=:all: --requirement requirements.txt --target lambdaPackage
The problem I found with this approach is that it fails if one of the dependencies doesn’t have a release matching those parameters (compare orjson with pyleri, for example).
As an alternative, we can run pip inside the AWS SAM build container images, which should ensure we get a compatible set of packages:
$ docker run -it --rm --entrypoint='' -v "$(pwd):/workspace" -w /workspace public.ecr.aws/sam/build-python3.10 pip install -r requirements.txt -t lambdaPackage
Don’t accidentally include a top level folder inside the zip file.
If we create the archive with zip lambdaPackage.zip lambdaPackage/* we end up with a zip file that includes the lambdaPackage directory. This was actually quite annoying to troubleshoot, because when you extract the zip file in macOS it will create this enclosing folder for you if it doesn’t exist (instead of spewing files all over the current folder), so it took me a while to realise what was happening.
To add only the contents of lambdaPackage to the archive, we want to run the zip command from inside the lambdaPackage directory:
$ pushd lambdaPackage
$ zip -r ../lambdaPackage.zip .
$ popd
You can also use the zipinfo command to examine the contents of the archive.
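For example, listing the archive contents should show files at the top level, with no enclosing lambdaPackage folder:
$ zipinfo lambdaPackage.zip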
Check File Permissions
Lambda needs to have permission to read your files and dependencies. The expected permissions for python projects are 644 for files and 755 for directories. You can update everything with:
$ find lambdaPackage -type f -exec chmod 644 {} +
$ find lambdaPackage -type d -exec chmod 755 {} +
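To double-check, we can ask find to list anything that still has different permissions - both of these should print nothing:
$ find lambdaPackage -type f ! -perm 644
$ find lambdaPackage -type d ! -perm 755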
Slim down the dependencies
The maximum size of a lambda package is 250MB (unzipped), including all lambda layers. There are a few things we can do to slim down the resulting package that shouldn’t impact the lambda environment.
Running pip with PYTHONPYCACHEPREFIX=/dev/null will discard all the __pycache__ files (possibly at the expense of slower cold start times). Since we won’t be running pip again, it’s also usually safe to delete all the .dist-info directories.
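Putting those together looks something like this (a sketch using the plain pip invocation; with the container approach above you’d pass the variable through with docker’s -e flag):
$ PYTHONPYCACHEPREFIX=/dev/null pip install -r requirements.txt -t lambdaPackage
$ rm -rf lambdaPackage/*.dist-info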
Some of the biggest wins can be found with modules like googleapiclient (75MB - about 30% of our allowance!) which include large model files describing each service they support. In this case it should be safe to delete the model file for any services we won’t be using. You’ll find them in googleapiclient/discovery_cache/documents/. Botocore had a similar issue, but since 1.32.1 it stores these model files compressed.
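For example, if the function only talks to the Drive and Sheets APIs (an assumption - keep whichever services you actually call), everything else in that directory can go:
$ find lambdaPackage/googleapiclient/discovery_cache/documents -type f ! -name 'drive.*' ! -name 'sheets.*' -delete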
Precompiling modules
One thing I didn’t try was precompiling the dependencies. AWS actually advises against this, but as long as it’s run inside the appropriate AWS SAM build container image the result should be compatible with the lambda runtime, and could speed up cold start times.
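I’d guess it would look something like this (untested - using compileall’s -b flag to write the .pyc files alongside the sources):
$ docker run -it --rm --entrypoint='' -v "$(pwd):/workspace" -w /workspace public.ecr.aws/sam/build-python3.10 python -m compileall -b lambdaPackage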
13 Aug 2024
In our AWS environment, no hosts are exposed to the internet in public subnets. Instead, we use a feature of AWS Systems Manager if we need to connect to instances. AWS Systems Manager Session Manager works by running an agent on each instance which opens a connection back to the Systems Manager service. To connect to an instance, someone with the appropriate IAM permissions can use the aws ssm start-session command, or the connect button in the AWS console. This connects to Systems Manager over HTTPS, which sends a message to the agent on the instance to set up a bi-directional tunnel between the client and the instance. Commands can then be sent over this tunnel to the agent, and output is sent back over the tunnel to the client.
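Starting a shell session from the CLI looks like this (with a placeholder instance id):
aws ssm start-session --target i-0123456abcdef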
This is usually all we need to connect to instances for troubleshooting etc., but occasionally we may need to transfer a file, and to do that we need to use another feature of Session Manager which allows us to forward ports from the remote instance back to our client. Conceptually this is much the same as when we use SSH to forward a remote port with ssh -L 8080:localhost:80 - port 80 on the remote instance can be accessed on our local machine on port 8080. In the case of Session Manager, the connection is tunnelled over HTTPS instead of SSH, but the result is the same.
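Session Manager exposes this through the AWS-StartPortForwardingSession document. As a sketch matching the SSH example - remote port 80, local port 8080:
aws ssm start-session --target i-0123456abcdef --document-name AWS-StartPortForwardingSession --parameters 'portNumber=80,localPortNumber=8080'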
To upload a file to the instance we want to use the scp command. First we update our .ssh/config file to tell ssh and scp to use a proxy command when connecting to instances.
Host i-*
    IdentityFile ~/awskeypair.pem
    User ec2-user
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"
Since we will be connecting to the sshd service on the instance (instead of the SSM Agent), we need to have valid credentials to log in - in this case the private key used to launch the instance. You could also leave out the IdentityFile and User settings and instead pass the appropriate authentication options on the command line when you connect.
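For example, the equivalent of the config above passed as options:
ssh -i ~/awskeypair.pem ec2-user@i-0123456abcdef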
We should then be able to connect to the instance over ssh by using:
ssh i-0123456abcdef
Or transfer files with:
scp somefile.txt i-0123456abcdef:/tmp/somefile.txt
It’s worth noting that when we do this, the SSM Agent is no longer able to save session logs to CloudWatch or S3 for us, since it doesn’t have access to the encrypted SSH traffic.
18 Jul 2023
I recently had to use terraform to add a capacity provider to an existing ECS cluster.
After adding a default capacity provider to the cluster, existing services still have launch_type=EC2, so we need to update them to use a capacity_provider_strategy in order to use it. Unfortunately we can’t do this in terraform due to a long-standing bug:
When an ECS cluster has a default_capacity_provider_strategy setting defined, Terraform will mark all services that don’t have ignore_changes=[capacity_provider_strategy] to be recreated.
The ECS service actually does support changing from launch_type to capacity_provider_strategy non-destructively, by forcing a redeploy. Since this uses the service’s configured deployment mechanism there’s no disruption.
We can make this change using the CLI:
aws ecs update-service --cluster my-cluster --service my-service --capacity-provider-strategy capacityProvider=ec2,weight=100,base=0 --force-new-deployment
If for some reason we need to revert, ECS also supports changing back from capacity_provider_strategy to launch_type, however the option is disabled in the console. As a workaround, we can pass an empty list of capacity providers to the update-service command, which will result in the service using launch_type=EC2 again.
aws ecs update-service --cluster my-cluster --service my-service --capacity-provider-strategy '[]' --force-new-deployment
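To confirm the revert took effect, describe-services should report the launch type as EC2 again (assuming the same cluster and service names):
aws ecs describe-services --cluster my-cluster --services my-service --query 'services[0].launchType'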
06 Jul 2023
I spent a little time recently working out how to get Home Assistant complications working on my watch. Once you work out the inputs they want, they seem to be really reliable, and they also function as a quick way to open the watch app to trigger actions and scenes.
It’s important to pick the right complication for the watchface you are using. I used the “Graphic Circular” complication type, with the “Open Gauge Image” template. The Apple Developer Documentation has examples of all the different complication types and which watch faces support them.
Instead of trying to type in the Jinja2 templates on my phone, I found it was much easier to use the developer tools template editor inside Home Assistant. Once it’s working you can just copy and paste it into the mobile app.
We need to use template syntax to generate two values. The first is the sensor value. There is only just enough space for 3 digits, so I’ve rounded the value to 1 decimal place:
{{ states("sensor.office_temperature") | float | round(1) }}
The other value we need is a number between 0.0 and 1.0, representing the percentage that the gauge should be filled. Since I don’t know what the temperature range is going to be, I’m using max() and min() to ensure we always get a value between zero and one, and that the gauge fills as we approach the target temperature.
{% set current = states("sensor.office_temperature") | float %}
{% set low = min(18.0, current) %}
{% set high = max(22.0, current) %}
{{ (current - low) / (high - low) }}
We also need to specify an icon - I used chair_rolling to match my office dashboard. It’s a bit small but you can more or less make it out.
02 Jun 2023
I thought it might be useful if Home Assistant knew when I was on call. I could use this to make sure the office doesn’t get too cold overnight, or to send me a notification if I leave home without my laptop.
We use PagerDuty, which gives you an iCal calendar feed, so I assumed I could just use this. Unfortunately, while Home Assistant has integrations for Local Calendars and CalDAV, neither of these supports just fetching a single .ics file over HTTP.
After a bit of digging around I discovered that Home Assistant stores local calendars in the .storage folder alongside its config files, so I figured I could just overwrite this file manually using a shell_command. You need to create the calendar first (under Settings > Devices & Services > Add Integration > Local Calendar). Once it’s created, add an event to get Home Assistant to create the local calendar file.
The shell_command goes into configuration.yaml:
shell_command:
  update_on_call_calendar: 'curl https://pagerduty.com/path/to/calendar > /config/.storage/local_calendar.on_call.ics'
We can then use the shell command in an automation, followed by homeassistant.reload_config_entry to get Home Assistant to reload the file from disk. I have this running on an hourly time_pattern trigger, but you could increase the update frequency for a calendar that changes more regularly.
alias: Refresh On Call Calendar
description: ""
trigger:
  - platform: time_pattern
    minutes: "0"
condition: []
action:
  - service: shell_command.update_on_call_calendar
    data: {}
  - service: homeassistant.reload_config_entry
    target:
      entity_id: calendar.on_call
    data: {}
mode: single
Once the calendar has updated you should see events show up in Home Assistant. The calendar state can be used in automations:
alias: On Call Laptop Check
description: "Send a push notification if I leave my laptop at home when I'm on call"
trigger:
  - platform: state
    entity_id:
      - person.tom_henderson
    to: not_home
condition:
  - condition: and
    conditions:
      - condition: state
        entity_id: calendar.on_call
        state: "on"
      - condition: device
        device_id: <device_id>
        domain: device_tracker
        entity_id: device_tracker.toms_m2
        type: is_home
action:
  - device_id: <device_id>
    domain: mobile_app
    type: notify
    message: You're on-call. Did you leave your laptop at home?
mode: single