
Builder collectd-60-solaris10-sparc Build #28

Results:

Build successful

SourceStamp:

Project collectd/collectd
Repository https://github.com/collectd/collectd
Branch collectd-6.0
Revision b18aed6930d9ce91534c41ce90453588ec006950
Got Revision b18aed6930d9ce91534c41ce90453588ec006950
Changes 10 changes

BuildSlave:

unstable10s

Reason:

The AnyBranchScheduler scheduler named 'schedule-collectd-60' triggered this build

Steps and Logfiles:

  1. git update ( 16 secs )
    1. stdio
  2. setproperty property 'ciflags' set ( 0 secs )
    1. stdio
    2. property changes
  3. shell '/opt/csw/bin/bash ./build.sh' ( 8 mins, 45 secs )
    1. stdio
  4. shell_1 './configure --prefix=/opt/csw ...' ( 4 mins, 30 secs )
    1. stdio
    2. config.log
  5. shell_2 'gmake -k ...' ( 13 mins, 26 secs )
    1. stdio
  6. shell_3 'gmake check' ( 2 mins, 6 secs )
    1. stdio
    2. test-suite.log

Build Properties:

Name Value Source
branch collectd-6.0 Build
builddir /export/home/buildbot-unstable10s/slave/collectd-60-solaris10-sparc slave
buildername collectd-60-solaris10-sparc Builder
buildnumber 28 Build
ciflags --disable-aggregation --disable-check_uptime --disable-csv --disable-java --disable-lua --disable-match_empty_counter --disable-match_hashed --disable-match_regex --disable-match_timediff --disable-match_value --disable-network --disable-perl --disable-postgresql --disable-target_notification --disable-target_replace --disable-target_scale --disable-target_set --disable-target_v5upgrade --disable-threshold --disable-write_graphite --disable-write_kafka --disable-write_mongodb --disable-write_pro .. [property value too long] SetPropertyFromCommand Step
codebase Build
got_revision b18aed6930d9ce91534c41ce90453588ec006950 Git
project collectd/collectd Build
repository https://github.com/collectd/collectd Build
revision b18aed6930d9ce91534c41ce90453588ec006950 Build
scheduler schedule-collectd-60 Scheduler
slavename unstable10s BuildSlave
workdir /export/home/buildbot-unstable10s/slave/collectd-60-solaris10-sparc slave (deprecated)

Forced Build Properties:

Name Label Value

Responsible Users:

  1. Eero Tamminen

Timing:

Start Wed Feb 1 07:56:38 2023
End Wed Feb 1 08:25:45 2023
Elapsed 29 mins, 6 secs

All Changes:


  1. Change #167826

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision 2179066f0bc207d24c833324fd9aa286df09e8b9

    Comments

    gpu_sysman: Add "pci_dev" label
    
    On a large cluster with different types of GPUs, it helps to know which
    card is of which type, not just their metrics. The "pci_dev" label adds
    the PCI device ID to the device metrics.
    
    Because GPUs within each cluster node are normally supposed to be
    identical i.e. differ only between nodes, and additional labels
    increase processing load, this is enabled only with the GpuInfo
    setting.
    
    Getting additional strings out of gpu_info() function required
    refactoring.  GPU index in errors is now output only by gpu_scan(),
    and gpu_info() gets pointers to label string pointers instead.

    Changed files

    • src/collectd.conf.pod
    • src/gpu_sysman.c
  2. Change #167827

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision 22aad76dbe5f13bbf71722b9d744b330dae894f6

    Comments

    gpu_sysman: Add memory "health" label if memory health is known
    
    Already in L0 spec v1.0.
    
    Included only to memory usage metrics which are already querying
    memory state (unlike memory BW metrics).

    Changed files

    • src/gpu_sysman.c
  3. Change #167828

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision 42b9044e1f7baf78c39378237b253b2e9a70aced

    Comments

    gpu_sysman: Provide returned error code when logging Sysman failures
    
    To help in debugging issues with Sysman API usage.
    
    (Includes minor stylistic improvements from Ukri & Tuomas)
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
  4. Change #167829

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision 543147761e99026df30cf2f5ccee71e4787bfab1

    Comments

    gpu_sysman: add "throttled_by" label to frequency metric
    
    Which is empty/missing when frequency is not throttled.
    
    Already in L0 spec v1.0.
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
    • src/gpu_sysman_test.c
  5. Change #167830

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision eefd3e98fb74d703f026f3d9c93e10154e4bb3ef

    Comments

    gpu_sysman: Fix memory metric comments
    
    Caught by Ukri.
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
  6. Change #167831

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision cd68f6f24f5515eb8feafa73935a927ff0baed7e

    Comments

    gpu_sysman: Minor improvements to test code
    
    Decrease max value and increase how many decimals are shown for metric
    values, so that the tests' verbose logging also shows useful values for
    ratios (which are in 0-1 range).
    
    Rest of changes improve 'gpu_sysman.c' test coverage by 1%.
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman_test.c
  7. Change #167832

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision b2d5aad2ab4862204710e5c813a341c38b3c76f9

    Comments

    gpu_sysman: make freq & mem handling more consistent
    
    Readability/consistency improvement: change frequency and memory
    metric handling to use new "reported" boolean instead of cache index,
    for checking when metrics need to be submitted.  This is more
    consistent with how other metric functions handle that.
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
  8. Change #167833

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision ccbd648d12c0e0724b7581459e5426d7f9486bcb

    Comments

    gpu_sysman: do metric reset on every loop round
    
    Not resetting the metric between loop rounds could result in an extra,
    incorrect label being reported for a metric when an earlier metric in
    the loop had a conditional label but a later metric does not satisfy
    that condition (the Sysman call for the info failed but the failure is
    ignored, or the Sysman struct value used for the given label is not set).
    
    This can happen e.g. with the conditional memory "health", frequency
    "throttled_by" and power "limit" labels.
    
    The alternative would be either setting or removing (= using NULL)
    values for each of the possible labels on every round.  Just resetting
    metric labels on every round seemed more robust (easier to review),
    and allowed simplifying the code slightly.
    
    Looking at collectd metric implementation, it causes more allocs /
    deallocs for the label array & label names though.
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
  9. Change #167834

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision 55a9296a0ec1b4d8d1605387f20a1304b25baa32

    Comments

    gpu_sysman: improve power limit handling
    
    Limits can be reported to only a subset of power domains. Therefore
    querying limits (for given GPU) should be disabled only when querying
    fails for all domains.
    
    Also added a TODO for an upcoming spec change I noticed in the spec tracker.
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
  10. Change #167835

    Category None
    Changed by Eero Tamminen <eero.t.tamminenohnoyoudont@intel.com>
    Changed at Wed 01 Feb 2023 07:55:27
    Repository https://github.com/collectd/collectd
    Project collectd/collectd
    Branch collectd-6.0
    Revision b18aed6930d9ce91534c41ce90453588ec006950

    Comments

    gpu_sysman: initialize struct .pNext members before use
    
    Next Sysman spec will explicitly state that they need to be initialized:
    https://github.com/oneapi-src/level-zero-spec/commit/98dfaaf041dedfd8c9bcf9a3957f334836e859e4
    
    And latest Sysman backend versions corrupt memory / crash unless .pNext
    values in some of the structs given to Get functions are initialized.
    
    (Releases before fall 2022 did not use .pNext values in get* calls,
    and worked fine. It just took a long time until I was able to verify
    whether this was a regression that will be fixed, or intended change.)
    
    Additionally, validate in test code that .pNext values are set to NULL
    (because some structs lack those pointer members, ADD_METRIC() macro
    cannot do that check for the <statename> functions given for it, but
    otherwise everything is covered).
    
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>

    Changed files

    • src/gpu_sysman.c
    • src/gpu_sysman_test.c
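
Change #167835 above describes initializing struct .pNext members before the structs are handed to Sysman Get functions, and validating in test code that they are NULL. The following is only a minimal sketch of that idea in C. The demo_props_t type and the demo_get_props() and demo_query_engines() functions are hypothetical stand-ins invented for illustration; they are not the Level Zero Sysman API and not the code in src/gpu_sysman.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Sysman-style properties struct (not the real API). */
typedef struct {
  int stype;       /* structure type tag */
  void *pNext;     /* extension chain: must be NULL or point to a valid struct */
  int num_engines; /* output member, filled in by the callee */
} demo_props_t;

/* Hypothetical mock of a "Get" function. In the spirit of the test-code
 * validation the commit describes, it rejects structs whose pNext member
 * is not NULL. Returns 0 on success, -1 on invalid input. */
static int demo_get_props(demo_props_t *props) {
  if (props == NULL || props->pNext != NULL)
    return -1;
  props->num_engines = 4; /* arbitrary demo value */
  return 0;
}

/* Caller side: initialize the struct before passing it on. Members that are
 * not named in a designated initializer are zero-initialized, so no member
 * is left holding stack garbage. */
static int demo_query_engines(void) {
  demo_props_t props = {.stype = 1, .pNext = NULL};
  int rc = demo_get_props(&props);
  assert(rc == 0); /* the mock accepts the struct because pNext is NULL */
  (void)rc;        /* keeps builds with NDEBUG free of unused warnings */
  return props.num_engines;
}
```

Declaring the struct without an initializer (`demo_props_t props;`) would leave pNext indeterminate, which is the situation the commit message says can corrupt memory or crash with recent backends.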
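
Change #167833 above explains how a conditional label can leak from one loop round into the next when metric labels are not reset. The sketch below reproduces that effect with a deliberately simplified, hypothetical label container. The demo_labels_t type and the demo_* functions are assumptions made for illustration; collectd's real metric and label types differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_LABELS 4

/* Hypothetical, simplified label set; not collectd's real metric type. */
typedef struct {
  const char *name[DEMO_MAX_LABELS];
  const char *value[DEMO_MAX_LABELS];
  size_t num;
} demo_labels_t;

/* Set a label, overwriting the value if the name already exists. */
static void demo_label_set(demo_labels_t *l, const char *name,
                           const char *value) {
  for (size_t i = 0; i < l->num; i++) {
    if (strcmp(l->name[i], name) == 0) {
      l->value[i] = value;
      return;
    }
  }
  if (l->num < DEMO_MAX_LABELS) {
    l->name[l->num] = name;
    l->value[l->num] = value;
    l->num++;
  }
}

/* Drop all labels. */
static void demo_label_reset(demo_labels_t *l) { *l = (demo_labels_t){0}; }

/* Walk over memory modules; health[i] == NULL means "health not known".
 * Returns how many labels the metric of the last module ended up with. */
static size_t demo_report(const char *const *health, size_t count,
                          int reset_each_round) {
  demo_labels_t labels = {0};
  for (size_t i = 0; i < count; i++) {
    if (reset_each_round)
      demo_label_reset(&labels);
    demo_label_set(&labels, "type", "memory");
    if (health[i] != NULL) /* conditional label */
      demo_label_set(&labels, "health", health[i]);
    /* a real plugin would submit the metric here */
  }
  return labels.num;
}
```

With the input {"ok", NULL} and no reset, the second module's metric still carries the first module's "health" label. With a reset on every round it carries only "type". As the commit message notes, the cost of resetting is extra allocation work in a real implementation, which this fixed-size sketch does not model.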
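
Change #167834 above states that power limit querying for a GPU should be disabled only when the query fails for all power domains. A minimal sketch of that decision, assuming a hypothetical array of per-domain query results (the demo_keep_limit_queries() helper is not part of the plugin):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* query_ok[i] is true when the limit query succeeded for power domain i.
 * Returns true when limit querying should stay enabled for this GPU,
 * i.e. when at least one domain reported limits. With zero domains there
 * is nothing to query, so the function returns false. */
static bool demo_keep_limit_queries(const bool *query_ok, size_t domains) {
  assert(query_ok != NULL || domains == 0);
  size_t failed = 0;
  for (size_t i = 0; i < domains; i++) {
    if (!query_ok[i])
      failed++;
  }
  /* Disable only when every single domain failed. */
  return failed < domains;
}
```

A single failing domain therefore no longer switches off limit metrics for the domains that do report them.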