How to measure an object's size accurately

dannypsnl · August 5, 2022, 8:41am

I wrote a draft macro called size that takes an object and measures the memory it uses:

#lang racket
(require (for-syntax syntax/parse))

(define-syntax (size stx)
  (syntax-parse stx
    [(_ obj)
     #'(let ([before-use (current-memory-use)])
         obj
         (define after-use (current-memory-use))
         (printf "size: ~a bytes~n" (- after-use before-use)))]))

(size (thread (λ () (let loop () (thread-receive) (loop)))))
(size 1)
(size 2)

The program above prints:

size: 8672 bytes
size: 32 bytes
size: 32 bytes

Is there any way to make it more accurate?

jbclements · August 5, 2022, 4:13pm

First naive suggestion: call (collect-garbage) three times before measuring memory use. (The 3x may be a relic from the deep past; I have no idea whether this is still the right heuristic.)

jbclements · August 5, 2022, 4:15pm

Ooh... except that if the value doesn't survive, that call could actually collect it. So I would return it after checking the memory use, to make sure it doesn't die. Even then, you run the risk of optimizers cleverly discovering that it's not actually needed.

jbclements · August 5, 2022, 4:16pm

But if you bind it to a top-level identifier, I would imagine that it definitely wouldn't get collected.

gus-massa · August 5, 2022, 6:19pm

With Racket 8.2 CS in DrRacket, I got:

size: 20112 bytes
size: 0 bytes
size: 0 bytes

My guess is that 32 is the size of the temporary variables. Fixnums like 1 or 2 don't allocate memory in either CS or BC.

I made another version with void/reference-sink, because otherwise the compiler may notice that the value is unused and skip allocating it. (There are also a few more optimizations that may remove the reference earlier than you expect.)

For example, try:

 (size (list 0 1 2 3 4 5 6 7 8 9))

I also added (sleep 1). I have never heard that it is necessary, but after running the program a few times I get more consistent results with a short pause (???).

With my version, I get a size that is roughly 10 times the number I get with your version:

size: 182768 bytes
size: 0 bytes
size: 0 bytes

I'm not sure it's correct, but my version is:

#lang racket
(require (for-syntax syntax/parse))
(require ffi/unsafe)

(define-syntax (size stx)
  (syntax-parse stx
    [(_ obj)
     #'(let ()
         (sleep 1)
         (collect-garbage)
         (collect-garbage)
         (collect-garbage)
         (define before-use (current-memory-use))
         (define thing obj)
         (define after-use (current-memory-use))
         (void/reference-sink thing)
         (printf "size: ~a bytes~n" (- after-use before-use)))]))

(size (thread (λ () (let loop () (thread-receive) (loop)))))
(size 1)
(size 2)

dannypsnl · August 6, 2022, 3:38am

That makes sense. What I actually want to know is the size of any value's representation in memory, but this approach seems to count only the new allocation.

jbclements · August 7, 2022, 12:44pm

You're pointing out that this doesn't measure the size of other existing values that are linked to from this one? So, for instance, in

(define a (list 3 4 5 6))
(define b (cons 44 a))

You'd like the size of 'b' to include the size of 'a'?

I don't see an easy way to compute this. I'm also not sure how useful it is, because it may include large values that are not collectible.
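To make the distinction concrete, here is a minimal sketch of a reachability walk. It is only an illustration: reachable-count is a hypothetical helper, it counts distinct pairs and vectors rather than bytes, and it ignores every other kind of value (structs, strings, closures, and so on). It shows how a measure that follows links would make b include a.

```racket
#lang racket/base

;; Hypothetical helper (not a Racket built-in): counts the distinct pairs
;; and vectors reachable from v. It reports object counts, not bytes.
(define (reachable-count v)
  (define seen (make-hasheq))            ; objects already counted
  (let loop ([v v])
    (cond
      [(hash-ref seen v #f) 0]           ; shared or cyclic: count only once
      [(pair? v)
       (hash-set! seen v #t)
       (+ 1 (loop (car v)) (loop (cdr v)))]
      [(vector? v)
       (hash-set! seen v #t)
       (+ 1 (for/sum ([e (in-vector v)]) (loop e)))]
      [else 0])))                        ; fixnums and everything else

(define a (list 3 4 5 6))
(define b (cons 44 a))

(printf "a: ~a pairs~n" (reachable-count a))   ; four pairs
(printf "b: ~a pairs~n" (reachable-count b))   ; the same four, plus one
```

A walk like this counts a's pairs as part of b even though allocating b created only one new pair, which is the gap between "reachable size" and "allocation delta".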

I guess I should ask: how would you like to use this information?

dannypsnl · August 7, 2022, 3:48pm

My original scenario is that I have many long-lived threads. They exchange messages to complete different jobs, and each has its own local storage. I want to measure memory usage accurately so that I know the limits of the design and can decide whether to store something in the thread or put it somewhere else.

bogdan · August 7, 2022, 7:00pm

You may be able to use current-memory-use with a custodian in that case. For example:

#lang racket/base

(define cust (make-custodian))
(define thd
  (parameterize ([current-custodian cust])
    (thread
     (lambda ()
       (let loop ([data null])
         (loop (cons (thread-receive) data)))))))

(collect-garbage)
(define old (current-memory-use cust))
(define (display-delta)
  (collect-garbage)
  (define new (current-memory-use cust))
  (define delta (- new old))
  (set! old new)
  (printf "delta: ~a~a old: ~a new: ~a~n" (if (> delta 0) "+" "") delta old new))

(display-delta)

(thread-send thd (vector 1 2 3))
(display-delta)
(thread-send thd (make-bytes 10240))
(display-delta)
(thread-send thd (make-bytes 10240))
(display-delta)

(define unreachable (make-bytes 10240))
(display-delta)

(thread-send thd (make-bytes 10240))
(display-delta)

The above prints

delta: -32 old: 3472 new: 3472
delta: +104 old: 3576 new: 3576
delta: +10272 old: 13848 new: 13848
delta: +10272 old: 24120 new: 24120
delta: 0 old: 24120 new: 24120
delta: +10272 old: 34392 new: 34392

on my machine with a recent build of Racket CS.
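For the many-threads scenario, the same idea could be wrapped in a small helper that gives each worker its own custodian. This is only a sketch: spawn-accounted and hoarder are hypothetical names, and the numbers reported will depend on the Racket version and on when the collector last ran.

```racket
#lang racket/base

;; Hypothetical helper: start a thread under a fresh custodian so that its
;; memory can be queried separately from the rest of the program.
(define (spawn-accounted thunk)
  (define cust (make-custodian))
  (define thd
    (parameterize ([current-custodian cust])
      (thread thunk)))
  (values thd cust))

;; Example worker: keeps every message it receives in its local storage.
(define (hoarder)
  (let loop ([data null])
    (loop (cons (thread-receive) data))))

(define-values (t1 c1) (spawn-accounted hoarder))
(define-values (t2 c2) (spawn-accounted hoarder))

;; Only worker 1 receives a large message.
(thread-send t1 (make-bytes 10240))

;; Custodian accounting is refreshed by a major collection.
(collect-garbage)
(printf "worker 1: ~a bytes~n" (current-memory-use c1))
(printf "worker 2: ~a bytes~n" (current-memory-use c2))
```

Comparing the per-custodian figures over time gives a rough picture of which workers are accumulating data, without having to measure individual values.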
