Index: db.scm ================================================================== --- db.scm +++ db.scm @@ -613,11 +613,11 @@ exn (begin (debug:print 0 *default-log-port* "EXCEPTION: database probably overloaded or unreadable in db:sync-tables.") (print-call-chain (current-error-port)) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (print "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (debug:print 0 *default-log-port* " status: " ((condition-property-accessor 'sqlite3 'status) exn)) (debug:print 0 *default-log-port* " src db: " (db:dbdat-get-path fromdb)) (for-each (lambda (dbdat) (let ((dbpath (db:dbdat-get-path dbdat))) (debug:print 0 *default-log-port* " dbpath: " dbpath) @@ -1038,11 +1038,11 @@ ((busy) (thread-sleep! sleep-time)) (else (debug:print 0 *default-log-port* "EXCEPTION: database probably overloaded or unreadable.") (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (print "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (debug:print 0 *default-log-port* " status: " ((condition-property-accessor 'sqlite3 'status) exn)) (print-call-chain (current-error-port)) (thread-sleep! sleep-time) (debug:print-info 0 *default-log-port* "trying db call one more time....this may never recover, if necessary kill process " (current-process-id) " on host " (get-host-name) " to clean up"))) (apply open-run-close-exception-handling proc idb params)) Index: docs/manual/megatest_manual.html ================================================================== --- docs/manual/megatest_manual.html +++ docs/manual/megatest_manual.html @@ -801,18 +801,17 @@

Megatest Design Philosophy

-

Megatest is intended to provide the minimum needed resources to make -writing a suite of tests and tasks for implementing continuous build -for software, design engineering or process control (via owlfs for -example) without being specialized for any specific problem -space. Megatest in of itself does not know what constitutes a PASS or -FAIL of a test or task. In most cases megatest is best used in -conjunction with logpro or a similar tool to parse, analyze and decide -on the test outcome.

+

Megatest is a distributed system intended to provide the minimum needed +resources to make writing a suite of tests and tasks for implementing +continuous build for software, design engineering or process control (via +owlfs for example) without being specialized for any specific problem +space. Megatest in of itself does not know what constitutes a PASS or FAIL +of a test or task. In most cases megatest is best used in conjunction with +logpro or a similar tool to parse, analyze and decide on the test outcome.

  • Self-checking -Repeatable strive for directed or self-checking test as opposed to delta based tests @@ -876,24 +875,24 @@

    Goals

    1. Reduce load on the file system. Sqlite3 files on network filesystem can be - a burden. + a burden. [DONE]

    2. Reduce number of servers and frequency of start/stop. This is mostly an - issue of clutter but also a reduction in "moving parts". + issue of clutter but also a reduction in "moving parts". [DONE]

    3. Coalesce activities to a single home host where possible. Give the user feedback that they have started the dashboard on a host other than the - home host. + home host. [DONE]

    4. Reduce number of processes involved in managing running tests. @@ -905,35 +904,35 @@

      Changes Needed

      1. ACID compliant db will be on /tmp and synced to megatest.db with a five - second max delay. + second max delay. [DONE]

      2. Read/writes to db for processes on homehost will go direct to /tmp - megatest.db file. + megatest.db file. [DONE]

      3. Read/wites fron non-homehost processes will go through one server. Bulk reads (e.g. for dashboard or list-runs) will be cached on the current host - in /tmp and synced from the home megatest.db in the testsuite area. + in /tmp and synced from the home megatest.db in the testsuite area. [DONE]

      4. -Db syncs rely on the target db file timestame minus some margin. +Db syncs rely on the target db file timestame minus some margin. [DONE]

      5. Since bulk reads do not use the server we can switch to simple RPC for the - network transport. + network transport. [DONE]

      6. Test running manager process extended to manage multiple running tests. @@ -947,31 +946,32 @@

        ww05 - migrate to inmem-db

        1. -Switch to inmem db with fast sync to on disk db’s [DONE] +Switch to inmem db with fast sync to on disk db’s [DONE]

        2. Server polls tasks table for next action

          1. -Task table used for tracking runner process [DONE] +Task table used for tracking runner process [Replaced by mtutil]

          2. -Task table used for jobs to run +Task table used for jobs to run [Replaced by mtutil]

          3. -Task table used for queueing runner actions (remove runs, cleanRunExecute, etc) +Task table used for queueing runner actions (remove runs, + cleanRunExecute, etc) [Replaced by mtutil]

        @@ -2151,10 +2151,10 @@

        Index: docs/manual/megatest_manual.txt ================================================================== --- docs/manual/megatest_manual.txt +++ docs/manual/megatest_manual.txt @@ -24,18 +24,17 @@ qualification. Megatest Design Philosophy -------------------------- -Megatest is intended to provide the minimum needed resources to make -writing a suite of tests and tasks for implementing continuous build -for software, design engineering or process control (via owlfs for -example) without being specialized for any specific problem -space. Megatest in of itself does not know what constitutes a PASS or -FAIL of a test or task. In most cases megatest is best used in -conjunction with logpro or a similar tool to parse, analyze and decide -on the test outcome. +Megatest is a distributed system intended to provide the minimum needed +resources to make writing a suite of tests and tasks for implementing +continuous build for software, design engineering or process control (via +owlfs for example) without being specialized for any specific problem +space. Megatest in of itself does not know what constitutes a PASS or FAIL +of a test or task. In most cases megatest is best used in conjunction with +logpro or a similar tool to parse, analyze and decide on the test outcome. * Self-checking -Repeatable strive for directed or self-checking test as opposed to delta based tests * Traceable - environment variables, host OS and other possibly influential Index: docs/plan.txt ================================================================== --- docs/plan.txt +++ docs/plan.txt @@ -8,44 +8,45 @@ Goals ^^^^^ . Reduce load on the file system. Sqlite3 files on network filesystem can be - a burden. + a burden. [green]#[DONE]# . Reduce number of servers and frequency of start/stop. This is mostly an - issue of clutter but also a reduction in "moving parts". + issue of clutter but also a reduction in "moving parts". [green]#[DONE]# . Coalesce activities to a single home host where possible. Give the user feedback that they have started the dashboard on a host other than the - home host. + home host. [green]#[DONE]# . Reduce number of processes involved in managing running tests. Changes Needed ^^^^^^^^^^^^^^ . ACID compliant db will be on /tmp and synced to megatest.db with a five - second max delay. + second max delay. [green]#[DONE]# . Read/writes to db for processes on homehost will go direct to /tmp - megatest.db file. + megatest.db file. [green]#[DONE]# . Read/wites fron non-homehost processes will go through one server. Bulk reads (e.g. for dashboard or list-runs) will be cached on the current host - in /tmp and synced from the home megatest.db in the testsuite area. -. Db syncs rely on the target db file timestame minus some margin. + in /tmp and synced from the home megatest.db in the testsuite area. [green]#[DONE]# +. Db syncs rely on the target db file timestame minus some margin. [green]#[DONE]# . Since bulk reads do not use the server we can switch to simple RPC for the - network transport. + network transport. [green]#[DONE]# . Test running manager process extended to manage multiple running tests. Current Items ~~~~~~~~~~~~~ ww05 - migrate to inmem-db ^^^^^^^^^^^^^^^^^^^^^^^^^^ -. Switch to inmem db with fast sync to on disk db's [DONE] +. Switch to inmem db with fast sync to on disk db's [green]#[DONE]# . Server polls tasks table for next action -.. Task table used for tracking runner process [DONE] -.. Task table used for jobs to run -.. Task table used for queueing runner actions (remove runs, cleanRunExecute, etc) +.. Task table used for tracking runner process [red]#[Replaced by mtutil]# +.. Task table used for jobs to run [red]#[Replaced by mtutil]# +.. Task table used for queueing runner actions (remove runs, + cleanRunExecute, etc) [red]#[Replaced by mtutil#] // ww32 // ~~~~ // Index: http-transport.scm ================================================================== --- http-transport.scm +++ http-transport.scm @@ -122,11 +122,11 @@ (print-error-message exn) (if (< portnum 64000) (begin (debug:print 0 *default-log-port* "WARNING: attempt to start server failed. Trying again ...") (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (portlogger:open-run-close portlogger:set-failed portnum) (debug:print 0 *default-log-port* "WARNING: failed to start on portnum: " portnum ", trying next port") (thread-sleep! 0.1) ;; get_next_port goes here Index: megatest.scm ================================================================== --- megatest.scm +++ megatest.scm @@ -1178,11 +1178,11 @@ (lambda (test) (common:debug-handle-exceptions #f exn (begin (debug:print-error 0 *default-log-port* "Bad data in test record? " test) - (print "exn=" (condition->list exn)) + (debug:print-error 5 *default-log-port* "exn=" (condition->list exn)) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) (print-call-chain (current-error-port))) (let* ((test-id (if (member "id" tests-spec)(get-value-by-fieldname test test-field-index "id" ) #f)) ;; (db:test-get-id test)) (testname (if (member "testname" tests-spec)(get-value-by-fieldname test test-field-index "testname" ) #f)) ;; (db:test-get-testname test)) (itempath (if (member "item_path" tests-spec)(get-value-by-fieldname test test-field-index "item_path" ) #f)) ;; (db:test-get-item-path test)) Index: portlogger.scm ================================================================== --- portlogger.scm +++ portlogger.scm @@ -56,11 +56,11 @@ exn (begin ;; (release-dot-lock fname) (debug:print-error 0 *default-log-port* "portlogger:open-run-close failed. " proc " " params) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (if (file-exists? fname)(delete-file fname)) ;; brutally get rid of it (print-call-chain (current-error-port))) (let* (;; (lock (obtain-dot-lock fname 2 9 10)) (db (portlogger:open-db fname)) (res (apply proc db params))) @@ -103,11 +103,11 @@ (handle-exceptions exn (begin (debug:print 0 *default-log-port* "EXCEPTION: portlogger database probably overloaded or unreadable. If you see this message again remove /tmp/.$USER-portlogger.db") (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (print-call-chain (current-error-port)) (debug:print 0 *default-log-port* "Continuing anyway.") #f) (sqlite3:fold-row (lambda (var curr) @@ -128,11 +128,11 @@ (handle-exceptions exn (begin (debug:print 0 *default-log-port* "EXCEPTION: portlogger database probably overloaded or unreadable. If you see this message again remove /tmp/.$USER-portlogger.db") (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (print-call-chain (current-error-port)) (debug:print 0 *default-log-port* "Continuing anyway.")) (portlogger:take-port db portnum)) portnum)) @@ -158,11 +158,11 @@ (handle-exceptions exn (begin (debug:print 0 *default-log-port* "EXCEPTION: portlogger database at " dbfname " probably overloaded or unreadable. Try removing it.") (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (print "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (debug:print 0 *default-log-port* " status: " ((condition-property-accessor 'sqlite3 'status) exn)) (print-call-chain (current-error-port)) #f) (case (string->symbol (car args)) ;; commands with two or more params ((take)(portlogger:take-port db (string->number (cadr args)))) Index: process.scm ================================================================== --- process.scm +++ process.scm @@ -53,11 +53,11 @@ (handle-exceptions exn (begin (print "ERROR: Failed to run command: " cmd " " (string-intersperse params " ")) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (print "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) #f) (let-values (((fh fho pid) (if (null? params) (process cmd) (process cmd params)))) (let loop ((curr (read-line fh)) Index: tasks.scm ================================================================== --- tasks.scm +++ tasks.scm @@ -36,11 +36,11 @@ (handle-exceptions exn (begin (print-call-chain (current-error-port)) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* " exn=" (condition->list exn)) + (debug:print 5 *default-log-port* " exn=" (condition->list exn)) (debug:print 0 *default-log-port* "tasks:wait-on-journal failed. Continuing on, you can ignore this call-chain") #t) ;; if stuff goes wrong just allow it to move on (let loop ((journal-exists (file-exists? fullpath)) (count n)) ;; wait ten times ... (if journal-exists @@ -87,17 +87,17 @@ exn (if (> numretries 0) (begin (print-call-chain (current-error-port)) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* " exn=" (condition->list exn)) + (debug:print 5 *default-log-port* " exn=" (condition->list exn)) (thread-sleep! 1) (tasks:open-db numretries (- numretries 1))) (begin (print-call-chain (current-error-port)) (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (debug:print 0 *default-log-port* " exn=" (condition->list exn)))) + (debug:print 5 *default-log-port* " exn=" (condition->list exn)))) (let* ((dbpath (db:dbfile-path )) ;; (tasks:get-task-db-path)) (dbfile (conc dbpath "/monitor.db")) (avail (tasks:wait-on-journal dbpath 10)) ;; wait up to about 10 seconds for the journal to go away (exists (file-exists? dbpath)) (write-access (file-write-access? dbpath)) Index: tests.scm ================================================================== --- tests.scm +++ tests.scm @@ -1560,11 +1560,11 @@ (tests:set-full-meta-info db test-id run-id minutes work-area (- remtries 1))) (let ((err-status ((condition-property-accessor 'sqlite3 'status #f) exn))) (debug:print-error 0 *default-log-port* "tried for over a minute to update meta info and failed. Giving up") (debug:print 0 *default-log-port* "EXCEPTION: database probably overloaded or unreadable.") (debug:print 0 *default-log-port* " message: " ((condition-property-accessor 'exn 'message) exn)) - (print "exn=" (condition->list exn)) + (debug:print 5 *default-log-port* "exn=" (condition->list exn)) (debug:print 0 *default-log-port* " status: " ((condition-property-accessor 'sqlite3 'status) exn)) (print-call-chain (current-error-port)))) (tests:update-testdat-meta-info db test-id work-area cpuload diskfree minutes) )))