// Copyright 2021, Matthew Welland.
//
// This file is part of Megatest.
//
// Megatest is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Megatest is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Megatest. If not, see .

Debugging
---------

Well Written Tests
~~~~~~~~~~~~~~~~~~

Test Design and Surfacing Errors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Design your tests to surface errors. Ensure that all logs are
processed by logpro (or a custom log processing tool) and can be
reached by a mouse click or two from the test control panel. To
illustrate, here is a set of scripts with nested calls where
script1.sh calls script2.sh which calls script3.sh which finally calls
the Cadence EDA tool virtuoso:

.script1.sh
..............................
#!/bin/bash
code ...
script2.sh some parameters > script2.log
more code ...
..............................

.script2.sh
..............................
#!/bin/bash
code ...
script3.sh some more parameters > script3.log
more code ...
..............................

.script3.sh
..............................
#!/bin/bash
code ...
virtuoso params and switches ...
more code ...
..............................

The log files script2.log, script3.log and the log output from
virtuoso are not accessible from the test control panel. It would be
much better for future users of your automation to use steps more
fully. One easy option would be to post process the logs in downstream
additional steps:

.testconfig
..............................
[ezsteps]
step1 script1.sh
step2 cat script2.log
step3 cat script3.log

[logpro]
step1 ;; some logpro rules
  (expect:required in "LogFileBody" > 0 "Expect this output" #/something expected/)
step2 ;; some logpro rules for script2.sh
step3 ;; some logpro rules for script3.sh

[scripts]
script1.sh #!/bin/bash
  code ...
  ...
..............................

With the above testconfig the logs for every critical part of the
automation are fully surfaced and rules can be created to flag errors,
warnings and aborts, and to ignore false errors. A user of your
automation will be able to see the important error with two mouse
clicks from the runs view.

An even better option would be to eliminate the nesting if possible.
As a general rule with layers, less is usually more. By flattening the
automation into a sequence of steps you can re-run a step with a
single click from the test control panel, or run only the errant step
from the command line in the test xterm.

The message here is to make debugging and maintenance easy for future
users (and yourself) by keeping clicks-to-error in mind.

Examining The Test Logs and Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test Control Panel - xterm
^^^^^^^^^^^^^^^^^^^^^^^^^^

From the dashboard click on a test PASS/FAIL button. This brings up a
test control panel. Near the center left of the window there is a
button labeled "Start Xterm". Push this to get an xterm with the full
context and environment loaded for that test. You can run scripts or
ezsteps by copying from the testconfig (hint: load up the testconfig
in a separate text editor window).

With more recent versions of Megatest you can step through your test
from the test control panel. Click on the cell labeled "rerun this
step" to rerun only that step, or click on "restart from here" to
rerun that step and all downstream steps.

NOTE 1: visual feedback can take some time. Give it a few seconds and
you will see the step change color to blue as it starts running.

NOTE 2: stepping through only works if you are using ezsteps.

A word on Bisecting
~~~~~~~~~~~~~~~~~~~

Bisecting is a debug strategy intended to speed up finding the root
cause of some bug.

.A complex process with a problem found in stage "E"
["graphviz", "bisecting.png"]
----------------------------------------------------------------------
include::bisecting.dot[]
----------------------------------------------------------------------

It is common to start debugging where the problem was observed and
then work back. However, by inspecting the output at stage "C" in the
example above you could potentially save a lot of debug effort. If the
output at "C" is already wrong, the cause is at "C" or upstream of it;
if it is correct, the cause lies downstream. This is similar to the
feature in source control tools like git and fossil called bisecting.

Tough Bugs
~~~~~~~~~~

Most bugs in Megatest-based automation will be in the scripts called
in your test steps and, if you follow the good design practice
described above, should be fairly easy to reproduce, isolate and find.

Some bugs, however, will come from subtle and hard to detect
interactions between Megatest and your OS and Unix environment. This
includes things like constructed variables that are legal in one
context (e.g. tcsh) but illegal in another context (e.g. bash),
variables that come from your login scripts, and access and
permissions issues (e.g. a script that silently fails due to no access
to needed data). Other bugs might be due to Megatest itself.

To isolate bugs like this you may need to look at the log files at
various stages in the execution process of your run and tests.

.A simplified diagram of the stages Megatest goes through to run a test.
["graphviz", "megatest-test-stages.png"]
----------------------------------------------------------------------
include::megatest-test-stages.dot[]
----------------------------------------------------------------------

.How to check variable values and inspect logs at each stage
[width="80%",cols="<,2m,2m",frame="topbot",options="header"]
|======================
|Stage | How to inspect | Watch for
|A: post config processing | megatest -show-config -target your/target | #f (failed var processing)
|B: post runconfig | megatest -show-runconfig -target your/target |
|C: processing testconfigs | inspect output from "megatest -run ..." | Messages indicating issues processing configs, dependency problems.
|D: process testconfig for test launch | inspect output from megatest runner | Zero items (items expansion yielded no items)
|E,F: launching test | start test xterm, look at mt_launch.log | Did your batch system accept the job? Has the job landed on a machine?
|G: starting test | look at your batch system's logs for the process | Did the megatest -execute process start and run?
|H,H1,H2: step execution | look at .log, .html and your own internal logs | Do you have sufficiently tight logpro rules? You must always have a "required" rule!
|======================

Bisecting megatest.csh/sh
^^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes finding the environment variable that is causing the problem
can be very difficult. Bisection can be applied here as well. Edit the
megatest.csh or megatest.sh file and comment out half of the remaining
suspect settings each round, then source the file in a fresh xterm and
run the test. This idea can also be applied to your .cshrc, .bashrc,
.aliases and other similar files.

csh and -f
^^^^^^^^^^

A common mistake when tcsh or csh is used for scripting is to omit -f
from the #! line, whether by oversight or by choice. Without -f the
shell reads your startup files (such as .cshrc) before running the
script, so the script's behavior depends on whatever those files set.

.Not good
..............................
#!/bin/tcsh
...
..............................

.Good
..............................
#!/bin/tcsh -f
...
..............................

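The missing -f described above is easy to overlook in a large test area, so a small check can help. The following is a minimal sketch, assuming bash; the function name check_csh_shebang is hypothetical and not part of Megatest. It only looks for a literal " -f" on the first line, so combined switches such as -xf would be reported as missing.

```shell
#!/bin/bash
# Hypothetical helper (not part of Megatest): flag csh/tcsh scripts whose
# "#!" line lacks -f.

# Returns 1 and prints a message if the first line of the file is a
# csh/tcsh "#!" line without a literal " -f"; returns 0 otherwise.
check_csh_shebang() {
    local file=$1 first
    # Read only the first line; tolerate a missing trailing newline.
    IFS= read -r first < "$file" || true
    case $first in
        '#!'*csh*)
            case $first in
                *' -f'*) return 0 ;;
                *) echo "$file: missing -f: $first"; return 1 ;;
            esac ;;
    esac
    return 0
}
```

One way to use it is to loop over the scripts in a test directory, for example `for f in *.csh; do check_csh_shebang "$f"; done`, and fix each file that is reported.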
Config File Processing
^^^^^^^^^^^^^^^^^^^^^^

As described above, it is often helpful to know the content of
variables in various contexts as Megatest works through the actions
needed to run your tests. A handy technique is to force the startup of
an xterm in the context being examined.

For example, if an item list is not being generated as expected you
can inject the startup of an xterm as if it were an item:

.Original items table
-----------------
[items]
CELLNAME [system getcellnames.sh]
-----------------

.Items table modified for debug
-----------------
[items]
DEBUG [system xterm]
CELLNAME [system getcellnames.sh]
-----------------

When this test is run an xterm will pop up. In that xterm the
environment is exactly that in which the script "getcellnames.sh"
would run. You can now debug the script to find out why it isn't
working as expected.
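
When a script behaves differently in two contexts (for example the debug xterm and a test xterm), comparing the two environments can show which variable differs. The following is a minimal sketch, assuming bash and the standard env, sort, diff and grep utilities. The function names env_snapshot and env_diff are hypothetical and not part of Megatest, and variables whose values contain newlines will not compare cleanly.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Megatest) for comparing environments.

# Save the current environment to the given file, sorted, one entry per line.
env_snapshot() {
    env | sort > "$1"
}

# Print the entries that differ between two snapshot files.
# Lines starting with "<" are only in the first file, ">" only in the second.
env_diff() {
    diff "$1" "$2" | grep '^[<>]'
}
```

To use it, source the functions in each xterm, run `env_snapshot` with a different file name in each context, then run `env_diff` on the two files from either xterm.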