# Container introduction

## Overview

Unwired Edge Cloud provides container virtualization for far-edge devices including industrial routers, access points and industrial compute boxes (SBCs, compute modules, COM Express modules, ...). This container virtualization allows running arbitrary application workloads on those industrial devices, in a secure and efficient manner, without any OS-level knowledge or development requirements.

## Typical use cases

Typical use cases are:

- public WiFi portals
- media servers
- data storage
- IoT: processing, local aggregation and data uplink to the internet
- public transport:
  - passenger information (PIS)
  - automatic passenger counting (APC)
- remote management
  - VPN tunnels
  - local management IP address allocation for cameras, screens and other LAN devices
  - remote access to those devices for upgrades and management
- network monitoring (ICMP ping, SNMP, HTTP)
- security services and security monitoring (port scanning, intrusion detection, advanced firewalls)

You can see some demos within this documentation.

## Features

### Edge virtualization features

The edge virtualization feature consists of:

- an orchestration to deploy and run containers across a fleet of devices
- an orchestration to run multiple containers on a device with a container runtime
- a container runtime to run a container isolated from the rest of the system

### Base container features

Containers have several base features:

- able to run arbitrary OCI containers (see [](#container-limitations) below)
- isolation from the host OS and other container workloads
- access to persistent storage
- memory limits
- advanced network configuration
  - allows interfacing to LAN and WAN sides of routers at the same time
  - interfacing with multiple LAN interfaces
  - automatically generated subnets for routed applications or roaming
  - automatic LAN and WAN network interface configuration
- per container instance config maps for instance configuration
- time good status (RTC and system time initialized and valid)
- access to local physical interfaces (serial bus, CAN, GPUs, neural processing engines, GPIOs, system management controllers)
- access to all network uplink options from Unwired Edge Cloud OS, including transparently bonded uplinks spanning multiple arbitrary modems or uplink interfaces (CloudLink)
- automatically injected Unwired Edge Cloud API keys with specific roles
- monitoring through Prometheus metrics (scraped up to 1 MB per container)

### Container lifecycle

Containers are completely lifecycle managed by the Unwired Edge Cloud:

- providing a new container version will automatically deploy that version to all affected devices
- container rollouts will be monitored for healthiness and rolled back automatically
- the lifecycle of a container application is separate from the host OS lifecycle

(container-limitations)=
### Limitations

- the source image architecture must match the target architecture
  - currently supported architectures are: `amd64`, `armhf`, and `arm64`
- OCI containers are started with `tini` as PID 1, so they cannot launch an init system that requires PID 1 for itself (e.g. S6 Overlay is not supported)
- `/run` is mounted to a tmpfs at container startup (so its content will be lost on container restart, and the tmpfs also overrides anything provided under `/run` in the original image)
- `busybox reboot` does not work and leads to a stopped container. To properly reboot the container, `busybox reboot -f` must be used.
  - note that `alpine` Linux images use `busybox` to provide `reboot` and other common system tools

## Data model

The orchestration consists of:

- a container image (OCI)
  - this is the actual source image
- a node service (Unwired Edge Cloud API)
  - this is the definition of how the images are run in a fleet of devices
  - includes network and other configuration
  - defines the current target version to be deployed
- for each deployed container: a node service instance (Unwired Edge Cloud API)
  - this is an actually deployed node service on a device
  - each of these instances can have custom config maps

The orchestration will primarily act through configuration changes to the device.

OCI images are imported to node services through the ```DM_import_node_service``` mutation. This import step will load the OCI image from its source into the Unwired Edge Cloud and snapshot the image there for further distribution to the devices. The ```DM_import_node_service``` mutation also defines all other configuration settings that affect the fleet deployment, such as networking, access to local physical interfaces, and memory and CPU limits. A node service is a deployment instruction on how to perform fleet deployments.

Node service instances are created by adding a container to a device. If automatic IP subnet allocation is enabled for the node service, a subnet unique within the fleet will be allocated. The only configurable option is to inject additional data via per instance config maps.
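A minimal import sketch, assembled only from fields that appear elsewhere on this page (```image_sources```, ```wan_configure_network```, ```lan_configure_network```, ```access_provided_subnet```, ```dhcp_range_template```); the actual ```DM_import_node_service``` schema may require additional fields, so treat this as illustrative rather than a complete call:

```graphql
mutation DM_import_node_service {
  DM_import_node_service(
    node_service_input: {
      # source image to snapshot into the Unwired Edge Cloud
      image_sources: [
        {
          arch: amd64
          image_reference: "ghcr.io/your-account/your-image:latest"
        }
      ]
      # network behavior (values as listed under "Network configurations");
      # immutable after the initial import
      wan_configure_network: allocated
      lan_configure_network: dhcp
      # allocate a unique /24 per instance out of this /16
      access_provided_subnet: "10.190.0.1/16"
      dhcp_range_template: "10.190.0.1/24"
    }
  )
}
```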
## Host-provided information and communication

```{warning}
The files `/lxc.env`, `/.time-status`, `/prometheus.prom` and `/host/ubus.sock` are deprecated and must not be used; they will be removed in future releases.
```

### Time good status

Within the container, *non-empty* content in the file `/host/.time-status` indicates that the system time was externally synced via NTP. This is important for many services, since a non-synced system time can result in arbitrary and huge time jumps into the future or the past. This typically happens because the RTC is not initialized.

(lxc-env)=
### lxc.env for environment variables

Every environment variable injected into the node_service in the import mutation will end up in the file `/host/lxc.env`. The file contains those variables in a way that allows it to be sourced by any shell (e.g. ```. /host/lxc.env```).

### Prometheus metrics

The container runtime will copy the contents of `/host/prometheus.prom` every minute and make them available through the [metrics API](../apis/api_metrics.md).

```{caution}
Make sure to write file contents atomically to avoid partial reads by the runtime.
```
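Putting the three mechanisms together, here is a minimal entrypoint sketch; the `/host/*` paths are the ones documented above, while the metric name, temp-file path and poll interval are illustrative assumptions:

```sh
#!/bin/sh
# Sketch of a container entrypoint using the host-provided files.
# Only the /host/* paths are documented above; the metric name,
# temp-file path and poll interval are illustrative assumptions.

# Block until the host signals a valid, NTP-synced system time
# (the file is non-empty once time is good).
while [ ! -s /host/.time-status ]; do
    sleep 5
done

# Load the environment variables injected via the import mutation.
. /host/lxc.env

# Publish Prometheus metrics atomically: write a temp file in the same
# directory, then rename it so the runtime never reads a partial file.
printf 'myapp_up 1\n' > /host/prometheus.prom.tmp
mv /host/prometheus.prom.tmp /host/prometheus.prom
```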
## Network configurations

LAN and WAN networking allow different kinds of network configurations.

WAN networking (option ```wan_configure_network```):

- ```disabled``` (the orchestration will not perform any network configuration)
- ```allocated``` (the orchestration will assign an IP from the client network uplink)
- ```dhcp``` (the orchestration will run a DHCP client on the WAN interface)
  - this is mainly used if the container's WAN uplink is bridged to 3rd-party VM DHCP servers in the network

LAN networking (option ```lan_configure_network```):

- ```disabled``` (the orchestration will not perform any network configuration)
- ```allocated``` (the orchestration will assign an IP from the provided client network)
  - The IP address will be statically allocated. If a network prefix was provided, the first available IP address will be allocated. Examples are:
    - subnet `192.168.0.0/24` -> `192.168.0.1` will be allocated for the LAN interface
    - subnet `192.168.0.100/24` -> `192.168.0.100` will be allocated for the LAN interface
- ```dhcp``` (the orchestration will run a DHCP server on the first LAN interface)
  - this will use either the subnet provided or the subnet allocated from the DHCP range template, if that feature is used

```{caution}
Network configuration (```node_service_input.network_config```) can only be set during the initial import and is immutable. Subsequent imports will inherit it from the initial import.
```

### DHCP range templates

LAN interfaces can be:

- allocated to a fixed subnet
- allocated from a subnet template by the orchestration
  - this feature allows automatically allocating unique subnets per container instance across a whole fleet

In the ```node_service_input``` of the import mutation, there are these two options to control the behavior:

```
node_service_input {
    ...
    access_provided_subnet: "10.190.0.1/16"
    dhcp_range_template: "10.190.0.1/24"
    ...
}
```

This will allocate a unique /24 for each container instance within the /16 subnet. This results in:

- the interface will be configured with the /16 subnet
  - for ```lan_configure_network``` set to ```allocated``` and ```dhcp```
- the DHCP server will be configured with address ranges from the /24 subnet
  - only if ```lan_configure_network``` is set to ```dhcp```

### DHCP one subnet for all instances of a node_service

For all instances of the same node_service, the same DHCP range and subnet will be used like this:

```
node_service_input {
    ...
    access_provided_subnet: "10.190.0.1/16"
    dhcp_range: "10.190.0.1/24"
    ...
}
```

This will allocate the single provided /24 for each container instance within the /16 subnet. This results in:

- the interface will be configured with the /16 subnet
  - for ```lan_configure_network``` set to ```allocated``` and ```dhcp```
- the DHCP server will be configured with the single provided /24 subnet
  - only if ```lan_configure_network``` is set to ```dhcp```

### Forwarding/NAT/masquerading

If both LAN and WAN networking are not disabled, the orchestration can optionally insert an automatic MASQUERADE rule to forward and masquerade/NAT traffic from LAN to the WAN interface. This is a very common router scenario and is required for devices connected to the LAN interface to reach the internet.
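For reference, the effect of this option is conceptually equivalent to the following rules (a sketch with illustrative interface names `lan0`/`wan0`; the orchestration inserts and manages the actual rule, so containers do not need to configure this themselves):

```sh
# Conceptual equivalent of the automatic rule; lan0/wan0 are
# illustrative interface names, not the actual device names.
iptables -t nat -A POSTROUTING -o wan0 -j MASQUERADE  # NAT LAN traffic leaving via WAN
iptables -A FORWARD -i lan0 -o wan0 -j ACCEPT         # permit LAN -> WAN forwarding
```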
## Private registry authentication

If your source image is private or otherwise requires some form of authentication, you must supply it in the ```image_sources``` as part of the input mutation. For your reference, here are a couple of common examples:

### Google Container Registry

To use standard registry authentication with the GCR, there are several options:

- Use a JSON keyfile as the password and ```_json_key``` as the user
- Use a short-lived access token obtained from ```gcloud auth print-access-token``` as the password and ```oauth2accesstoken``` as the user

An example ```authn``` with ```eu.gcr.io``` via short-lived access token:

```
image_sources: [
    {
        arch: amd64
        image_reference: "eu.gcr.io/your-account/your-image:latest"
        authn: {
            registry_user: "oauth2accesstoken"
            registry_pw: "your.accesstoken.here"
        }
    }
]
```

For further details, see [Google Container Registry - Advanced Authentication](https://cloud.google.com/container-registry/docs/advanced-authentication).

### GitHub Container Registry

To use standard registry authentication with the GHCR, there are several options:

- Use your GitHub username as the user and a PAT (Personal Access Token, Classic) as the password (ensure it has the ```read:packages``` scope)
- If you are importing the image as part of a GitHub Action, use your GitHub username as the user and the value from the ```GITHUB_TOKEN``` env variable as the password

An example ```authn``` with ```ghcr.io``` via PAT:

```
image_sources: [
    {
        arch: amd64
        image_reference: "ghcr.io/your-account/your-image:latest"
        authn: {
            registry_user: "your-account"
            registry_pw: "your.pat.goes.here"
        }
    }
]
```

For further details, see [GitHub Container Registry - Authenticating to the container registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#authenticating-to-the-container-registry).

## Orchestration

### Update time windows

Please see the [firmware guide for setting update time windows](../apis/api_rollout_configuration.md#setting-update-time-windows), as ```node_service_instances``` follow the rollout ```time_span``` of their devices.

### Version pinning

Similar to the [firmware rollout configuration](../apis/api_rollout_configuration.md), ```node_service_instances``` can also be pinned to a specific version:

```graphql
mutation DM_update_node_service_instance_rollout_config {
  DM_update_node_service_instance_rollout_config(
    node_service_instance_ids: ["abcdefgh"]
    version_target_selector: { value: "version:201" }
  )
}
```

The current policies for container version target selectors include:

- ```null```: always use the latest version available for that container
- ```version```: set a single specific version (for fully externally controlled version selections)
- ```disabled```: disable orchestration of the specified ```node_service_instances```

To get a valid ```version``` to use in the above mutation, you can query all currently available ```node_services``` using ```DM_get_node_services```:

```graphql
query DM_get_node_services {
  DM_get_node_services(customer_id: "102e88a2-86cf-4a2d-8712-99e8e652db48", take: 10) {
    services {
      name
      node_service_id
      container_version
      container_image
    }
  }
}
```
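Conversely, a pinned instance can be taken out of orchestration with the ```disabled``` policy; a sketch, assuming it is passed through the same ```value``` field as ```version:201``` above:

```graphql
mutation DM_update_node_service_instance_rollout_config {
  DM_update_node_service_instance_rollout_config(
    node_service_instance_ids: ["abcdefgh"]
    # assumption: "disabled" uses the same selector format as "version:..."
    version_target_selector: { value: "disabled" }
  )
}
```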