Cadvisor: container_start_time_seconds remains the same forever

Created on 12 Feb 2019  ·  14 Comments  ·  Source: google/cadvisor

Problem
container_start_time_seconds is a constant that never changes, not even after a container restart.

Description
We start 3 containers at the same time: cadvisor, container_1, container_2.
After 2 minutes I run the following query:
(time()-container_start_time_seconds)/60, which subtracts the start time from the current time and gives the uptime in minutes.
Result:

container     up time 
container_1   2
container_2   2

Now we shut down container_2 with docker container stop container_2.
container_start_time_seconds is now absent for container_2, which is fine.
Then we wait, for example, 3 minutes, restart container_2, wait 1 more minute, and run the query again.

Expected behaviour:

container     up time 
container_1   6
container_2   1

Current behaviour:

container     up time 
container_1   6
container_2   6

When checking the raw container_start_time_seconds metric itself, I found that its value is a constant based on the first startup of the container. It doesn't matter if you shut the container down and don't run it for the next 3 days; the start time remains the same forever.
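
The timeline in this report can be checked with a small sketch. It only redoes the arithmetic of the (time()-container_start_time_seconds)/60 query in Python; the function name and the offsets in seconds are illustrative, taken from the scenario described here and not from cAdvisor itself.

```python
def uptime_minutes(now_s: float, start_s: float) -> float:
    """Mirror of (time() - container_start_time_seconds) / 60."""
    return (now_s - start_s) / 60


# Timeline from the report, in seconds since the first start (t = 0):
# container_2 is stopped at 2 min, restarted 3 min later (t = 5 min),
# and the query runs 1 min after the restart (t = 6 min).
first_start = 0
restart = 5 * 60
query_time = 6 * 60

# If the metric followed the restart, the query would see the new start time.
expected = uptime_minutes(query_time, restart)
# What is reported instead: the start time stays frozen at the first start.
observed = uptime_minutes(query_time, first_start)

print(f"expected: {expected:.0f} min, observed: {observed:.0f} min")
```

The two values match the "Expected behaviour" and "Current behaviour" rows for container_2.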

Or maybe I misunderstand the use of this metric? If so, please advise how I can get the current uptime of a container.

thanks

area/api, help wanted, kind/bug

All 14 comments

I think we are just using whatever docker passes back to us. Can you query docker and see what docker thinks the start time of the container is?

After running docker inspect container_id, it seems that State.StartedAt updates correctly.
I created and started the container at 12:15, stopped it at 12:29, then restarted it at 12:30.

  {
      "Id": "150aceb457d66818288f9b0404eb8ea423b959521930dfa29b2ed43345d4d90b",
      "Created": "2019-02-14T12:15:18.7356321Z",
      "Path": "/postgres_exporter",
      "Args": [
          "--extend.query-path=/queries.yaml"
      ],
      "State": {
          "Status": "running",
          "Running": true,
          "Paused": false,
          "Restarting": false,
          "OOMKilled": false,
          "Dead": false,
          "Pid": 21225,
          "ExitCode": 0,
          "Error": "",
          "StartedAt": "2019-02-14T12:30:47.2003058Z",
          "FinishedAt": "2019-02-14T12:29:21.1354284Z"
...etc

Running the (time()-container_start_time_seconds)/60 query at 12:31 shows 16 min. It seems the metric is taken from the 'Created' field.
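
The 16 min figure can be reproduced from the inspect output with a rough sketch like the one below. The helper names are hypothetical, and the exact query time is an assumption (the comment only says "at 12:31"); the two timestamps are copied from the JSON above.

```python
from datetime import datetime, timezone


def parse_docker_time(value: str) -> datetime:
    """Parse a Docker UTC timestamp, truncating to microsecond precision."""
    body = value.rstrip("Z")
    head, _, frac = body.partition(".")
    # datetime holds at most 6 fractional digits; Docker may print more.
    frac = frac[:6].ljust(6, "0")
    parsed = datetime.strptime(f"{head}.{frac}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def minutes_since(now: datetime, then: datetime) -> float:
    return (now - then).total_seconds() / 60


# Fields copied from the docker inspect output in this thread.
inspect = {
    "Created": "2019-02-14T12:15:18.7356321Z",
    "State": {"StartedAt": "2019-02-14T12:30:47.2003058Z"},
}
# Assumed query time: the thread only says the query ran "at 12:31".
now = datetime(2019, 2, 14, 12, 31, 0, tzinfo=timezone.utc)

since_created = minutes_since(now, parse_docker_time(inspect["Created"]))
since_started = minutes_since(now, parse_docker_time(inspect["State"]["StartedAt"]))

print(f"since Created:   {since_created:.1f} min")
print(f"since StartedAt: {since_started:.1f} min")
```

Under that assumption, the time since Created rounds to 16 minutes while the time since StartedAt is well under one minute, which is consistent with the metric following the creation time.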

Are you using kubernetes, swarm, or something else?

I think there might be a difference in the mechanism they use to "restart" containers on docker, i.e. I think kubernetes might actually create a new container each time, whereas swarm makes use of docker's restart policy.

The container_start_time_seconds metric was intended to work for uses like this.

Rancher 1.6.14 + Cattle
cAdvisor version v0.32.0 (8949c822)
Docker version 1.12.6, build 78d1802

In my environment, the container ID does not change when restarting docker, so container_start_time_seconds stays the same.

But if updating the container changes its ID, container_start_time_seconds changes too.

So (time()-container_start_time_seconds) can't be used as a container restart metric.

We are hitting the same issue as well.

Ideally it should work as follows:

container_start_time_seconds should be renamed to container_create_time_seconds.

Then container_start_time_seconds should be modified to use the proper start time from the docker daemon. This would work for both docker/swarm and Kubernetes based installations.

At this point it looks like container_start_time_seconds primarily caters to Kubernetes, where new pods are created.

I agree with @ravjanga that we should probably separate the two out into two metrics. The creation time is still very important, even for swarm users, as that has the start time of cumulative container metrics.

Actually, I tried modifying the code to add the start time as a new metric, since it is already available in the docker spec that is used for collecting the create time and other static metrics like the memory limit.

It works fine. But it looks like this part of the code doesn't get executed on every poll, so container restarts do not pick up the new start time.

A similar problem seems to exist if I update the docker memory limits after cadvisor is launched.

@ravjanga that makes sense, I forgot we only get the spec during container creation. I do worry about the impact on docker of calling its api once every ~15 seconds for each container.

I understand the impact, but at least some control, such as how often to query these parameters, would definitely help. I tried making these changes but ran into a few other issues. I will share the details if I am able to get this working.
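
To illustrate the kind of control meant here, a minimal sketch of a spec cache with a configurable refresh interval follows. It is not cAdvisor code: the class name, the fetch callback, and the interval parameter are all hypothetical, and the clock is injected only so the example can run without waiting.

```python
import time
from typing import Callable, Dict, Optional


class RefreshingSpec:
    """Caches a container spec and re-fetches it at most once per interval."""

    def __init__(
        self,
        fetch: Callable[[], Dict],
        interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch            # hypothetical call into the docker API
        self._interval_s = interval_s  # how often the spec may be re-queried
        self._clock = clock
        self._spec: Optional[Dict] = None
        self._fetched_at = 0.0

    def get(self) -> Dict:
        now = self._clock()
        if self._spec is None or now - self._fetched_at >= self._interval_s:
            self._spec = self._fetch()
            self._fetched_at = now
        return self._spec


# Demo with a fake clock and a fake fetch that records when it was called.
calls = []
fake_now = [0.0]


def fake_fetch() -> Dict:
    calls.append(fake_now[0])
    return {"StartedAt": f"t={fake_now[0]}"}


spec = RefreshingSpec(fake_fetch, interval_s=60, clock=lambda: fake_now[0])
spec.get()          # first call always fetches
fake_now[0] = 15.0
spec.get()          # within the interval: served from the cache
fake_now[0] = 75.0
spec.get()          # interval elapsed: fetched again
print(len(calls))
```

With a 60 second interval and a 15 second poll, only every fourth poll or so would reach docker, which trades freshness of the start time and limits against load on the docker API.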

Start time and memory limits are not static and can change over the life of the container.

Docker also allows the properties below to be updated at runtime.

      --blkio-weight uint16        Block IO (relative weight), between 10 and 1000, or 0 to disable (default 0)
      --cpu-period int             Limit CPU CFS (Completely Fair Scheduler) period
      --cpu-quota int              Limit CPU CFS (Completely Fair Scheduler) quota
      --cpu-rt-period int          Limit the CPU real-time period in microseconds
      --cpu-rt-runtime int         Limit the CPU real-time runtime in microseconds
  -c, --cpu-shares int             CPU shares (relative weight)
      --cpus decimal               Number of CPUs
      --cpuset-cpus string         CPUs in which to allow execution (0-3, 0,1)
      --cpuset-mems string         MEMs in which to allow execution (0-3, 0,1)
      --kernel-memory bytes        Kernel memory limit
  -m, --memory bytes               Memory limit
      --memory-reservation bytes   Memory soft limit
      --memory-swap bytes          Swap limit equal to memory plus swap: '-1' to enable unlimited swap
      --restart string             Restart policy to apply when a container exits

Will this get fixed? We are currently on v0.25.0, where container_start_time_seconds is usable for checking whether a container has restarted; because of this metric we can't upgrade cAdvisor yet. Or does anyone know of another metric that can detect container restarts?

Any news??

Is there any workaround for getting info about container restarts?
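
One possible direction, sketched under assumptions and not an official cAdvisor workaround: since State.StartedAt from docker inspect was reported earlier in this thread to update on restart, successive readings of that field could be compared outside of cAdvisor. The function name is hypothetical, and the first timestamp in the sample data is made up for illustration; only the 12:30:47 value comes from this thread.

```python
from typing import Iterable


def count_restarts(started_at_samples: Iterable[str]) -> int:
    """Count how often the StartedAt value changes between successive polls."""
    restarts = 0
    previous = None
    for current in started_at_samples:
        if previous is not None and current != previous:
            restarts += 1
        previous = current
    return restarts


# One reading per poll of the same container.
samples = [
    "2019-02-14T12:15:19Z",          # hypothetical value for the first start
    "2019-02-14T12:15:19Z",
    "2019-02-14T12:30:47.2003058Z",  # StartedAt from the inspect output in this thread
    "2019-02-14T12:30:47.2003058Z",
]

print(count_restarts(samples))
```

Readings like these could be collected with docker inspect --format '{{.State.StartedAt}}' container_id, at the cost of the extra docker API calls discussed above.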

@ravjanga are you planning to work on this feature/bugfix? We are hitting the same issue as we run docker containers without Kubernetes. Monitoring container restarts would be very useful for us.

This was addressed in my old company in a very different way with limited
code changes to cadvisor. Unfortunately I am not with that company anymore
and cannot share the details.

