Troubleshooting
Topology
Verity displays a live map of the devices in the network and their connections.
Topology Detection and Behavior
- Autodiscovery
- Connection moves
- Remove connections
Reports
Verity provides a comprehensive list of reports to identify current conditions in the network. To view reports, navigate to the Reports window or click the Reports icon of a device object.
MAC Address Workbench
Verity provides the ability to display MAC addresses across the entire system.
The MAC Address Workbench lets you search for devices by MAC address. Clicking the Search button without specifying a MAC address displays all devices.
From within a list of devices, you can click to open a new tab that searches on the corresponding MAC address.
Mac Exports
Mac Exports provides a summary MAC report for all devices connected to the network's user-facing Ethernet ports. This tool can be automated to generate a daily report or used on a case-by-case basis as needed. To manually run a report, enable the checkbox for all services you want to include, click Run Report Now, and wait for the Download Latest Report icon to turn blue. Once it does, click the icon to download the CSV file.
Ping and Trace Route
Verity provides tools to troubleshoot Layer 3 issues for the underlay and Tenant connections for services configured for Layer 3. The Ping and Trace Route feature is a diagnostic tool that sends a "ping" to and from a specified address and details the message path.
Beware of Misconfigurations
This feature is only applicable to systems using GNMI (gRPC Network Management Interface).
Before you use this feature, ensure the chosen device has its Device Controller/Managed Device Comm Type set to GNMI.
Controls
To open the Ping and Trace Route dialog box, click the Open Ping/Traceroute Dialog button.
To use this feature, configure the following parameters and click the Start button.
Source
- Device: This is the source device of the ping.
- VRF: This is where you select the VRF. The VRFs of Tenants, the Underlay (selected device), and Management (Device Controller) are all options.
- IP: The source IP address from which the ping is sent. The IP address changes depending on the VRF setting.
Destination
- IP: The targeted IP address to receive the ping command.
Ping
- Count: The number of ping events sent before the process is complete.
Trace Route
When this option is selected, the trace route is included in the Result field.
CMD Status
This is a progress indicator. When the feature is ready, CMD Status says Ready; when active, CMD Status says In Progress; and when the process is complete, it says Done.
DHCP Snooping
Verity provides the location of connected L3 addresses detected by DHCP snooping functions in SONiC.
To enable DHCP snooping, navigate to Topology/Site Settings and check the box next to Enable DHCP Snooping.
Viewing DHCP Assigned IPs
To view DHCP assigned IP addresses, go to Reports/DHCP Assigned IPs.
View License
To view the license, go to Administration/Licensing.
The Licensing object displays License Usage and Physical Port Usage bar graphs. With the object in focus, you can see the license expiration date, support contact information, and license reports.
Clicking the report icon for License Utilization, License by Device Utilization, or Physical Port Utilization opens the corresponding report. For a report on only used, preprovisioned, stranded, or spare licenses, click that section of the bar graph. The same applies to the licensed, fabric, and spare sections of the physical port graph.
An example License Utilization report is shown below:
User Manual: Satori Troubleshooting Script
Overview
This troubleshooting script is a comprehensive diagnostic and maintenance tool for the Verity Satori System. It provides a menu-driven interface to check system status, perform maintenance tasks, and generate debug information for support purposes.
Purpose
The script helps administrators:
- Diagnose Issues: Check service status, connectivity, and configuration
- Perform Maintenance: Adjust logging levels, restore backups, populate data
- Generate Support Data: Create comprehensive debug snapshots for technical support
- Monitor Health: Verify all Satori components are functioning correctly
Prerequisites
System Requirements
- Operating System: Ubuntu 18.04+ with Docker support
- Privileges: Must run as root or with sudo access
- Dependencies:
  - Docker and Docker Compose installed and running
  - Satori system previously installed via setup script
  - Active internet connection for API testing
Before Running
- Ensure Satori is installed in /be_satori/
- Verify Docker services are accessible
- Have system administrator privileges
- Ensure the /be_install/ directory exists for debug outputs
Running the Script
Basic Execution
sudo bash troubleshooting.sh
Script Interface
The script presents a clear menu with numbered options and waits for user input.
Menu Options Reference
Option 1: Check Satori Services Status
1. Check if the Satori services are running
Purpose: Verifies all Satori Docker containers are running properly
What it checks:
- ETL (verity-ml-python)
- Grafana
- Prometheus
- Promtail
- Telegraf
- Neo4j-graph
- Loki
- Cadvisor
- Alertmanager
- Node-exporter
- Sensai-service
- Neo4j-vector
Expected Output:
ETL services are running.
Grafana services are running.
Prometheus services are running.
...
--------------------------------------------
Total Services Tested: 12
Currently Running: 12
Not Running: 0
--------------------------------------------
If Services Are Down:
run the command 'sudo docker compose -f /be_satori/docker-compose.yml up -d' and re-run this script
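Before restarting everything, it can help to see exactly which containers are missing. The following is a minimal sketch rather than the script's own logic: the `missing_services` helper is hypothetical, and the container names passed to it are placeholders, since the actual names depend on the compose file.

```shell
# Hypothetical helper: print each expected name that is absent from the
# list of running container names supplied on stdin (one name per line).
missing_services() (
    running=$(cat)
    for name in "$@"; do
        # -x matches the whole line, -F treats the name as a literal string
        if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
            echo "$name"
        fi
    done
)

# Example usage (container names are placeholders; requires Docker):
#   sudo docker ps --format '{{.Names}}' | missing_services grafana prometheus loki
```

An empty result means every listed container is running.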
Option 2: Check Time Synchronization
2. Check the status of the time synchronization
Purpose: Verifies system time is properly synchronized
Expected Output:
Local time: Wed 2024-12-18 10:30:15 UTC
Universal time: Wed 2024-12-18 10:30:15 UTC
RTC time: Wed 2024-12-18 10:30:15
Time zone: UTC (UTC, +0000)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Key Indicators:
- System clock synchronized: yes (time is properly synced)
- NTP service: active (time synchronization service is running)
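The two indicators can also be checked outside the menu by filtering `timedatectl` output. This is a sketch under assumptions: `clock_healthy` is a hypothetical helper, and the patterns match the field labels shown in the expected output above. Older systemd releases may label the NTP field differently, in which case the second pattern needs adjusting.

```shell
# Hypothetical helper: succeed only if timedatectl-style text on stdin
# reports both key indicators.
clock_healthy() (
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'System clock synchronized: yes' &&
    printf '%s\n' "$status" | grep -q 'NTP service: active'
)

# Example usage:
#   timedatectl | clock_healthy && echo "time sync OK" || echo "time sync problem"
```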
Option 3: Report FQDN Setting
3. Report the FQDN setting in the Satori.ini file
Purpose: Shows the configured vNetC host address
Expected Output:
FQDN setting: vnc-satori.company.com
Uses: Verify Satori knows where to connect to vNetC
Option 4: Report MAC Address Setting
4. Report the VERITY_SATORI_VM_MAC_ADDRESS setting in the Satori.ini file
Purpose: Shows the configured MAC address for VM identification
Expected Output:
VERITY_MONITORING_VM_MAC_ADDRESS setting: aa:bb:cc:dd:ee:ff
Uses: Verify unique VM identification in the Satori system
Option 5: Report Package Version
5. Report the version of the Satori package
Purpose: Shows the currently installed Satori software version
Expected Output:
Version: 2.1.3
Uses: Verify version for support purposes or update planning
Option 6: Perform Announcement Call
6. Perform an announcement call to the FQDN setting
Purpose: Tests Satori's ability to announce itself to vNetC
What it does: Executes the announcement process to register with vNetC
Expected Behavior: Script runs the announcement process and reports success/failure
Uses: Verify connectivity and registration with the central Satori controller
Option 7: Check vNetC API Status
7. Check vNetC API calls are working and report the HTTP status code
Purpose: Tests API connectivity to the vNetC system
Expected Output (Success):
Verity vNetC API calls are working using the FQDN/ROOT_API_URL: vnc-satori.company.com
Expected Output (Failure):
Verity vNetC API calls are not working. FQDN/ROOT_API_URL: vnc-satori.company.com | HTTP status code: 404
HTTP Status Codes:
- 200: Success - API is working properly
- 401: Unauthorized - Check credentials
- 404: Not Found - Check URL/hostname
- 500: Server Error - vNetC system issue
- 000: Connection failed - Network/DNS issue
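A status code can be obtained by hand with `curl` and mapped to the hints in the list above. This is an illustrative sketch only: `explain_status` is a hypothetical helper, and the URL in the usage comment is a placeholder, because the API path the script actually calls is not documented here.

```shell
# Hypothetical helper: translate an HTTP status code into the hint from the
# status code list.
explain_status() {
    case "$1" in
        200) echo "Success: API is working properly" ;;
        401) echo "Unauthorized: check credentials" ;;
        404) echo "Not Found: check URL/hostname" ;;
        500) echo "Server Error: vNetC system issue" ;;
        000) echo "Connection failed: network/DNS issue" ;;
        *)   echo "Unexpected status code: $1" ;;
    esac
}

# Example usage (placeholder URL; curl prints 000 when no HTTP response arrives):
#   code=$(curl -s -o /dev/null -w '%{http_code}' "https://vnc-satori.company.com/")
#   explain_status "$code"
```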
Option 8: Set Debug Logging
8. Set logging level to DEBUG
Purpose: Enables detailed logging for troubleshooting
Effect: Sets LOG_LEVEL=DEBUG in satori.ini
When to use:
- Investigating specific issues
- Before reproducing problems for support
- Detailed system analysis
Warning
Debug logging generates large log files and may impact performance
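The effect of this option (and of Option 10 for ERROR) amounts to rewriting one line of the configuration file. The sketch below shows one way that edit could be done by hand. `set_log_level` is a hypothetical helper, the `KEY=value` layout is assumed from the `LOG_LEVEL=DEBUG` form shown above, and the file path is a placeholder because the location of satori.ini is not stated here. Whether services need a restart to pick up the change is also not covered.

```shell
# Hypothetical helper: rewrite the LOG_LEVEL line in an ini-style file.
# Usage: set_log_level <file> <DEBUG|ERROR>
set_log_level() (
    file=$1
    level=$2
    case "$level" in
        DEBUG|ERROR) ;;
        *) echo "unsupported level: $level" >&2; exit 1 ;;
    esac
    tmp=$(mktemp) || exit 1
    # Replace the value on any line that starts with LOG_LEVEL=, then write
    # the result back in place so ownership and permissions are preserved.
    sed "s/^LOG_LEVEL=.*/LOG_LEVEL=$level/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    exit "$rc"
)

# Example usage (placeholder path):
#   sudo sh -c '. ./loglevel.sh; set_log_level /path/to/satori.ini DEBUG'
```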
Option 9: Populate Knowledge Graph
9. Perform a ETL call to populate the knowledge graph
Purpose: Manually triggers data extraction, transformation, and loading
What it does: Executes the knowledge graph population process
When to use:
- After configuration changes
- To refresh Satori data
- Testing data pipeline functionality
Expected Behavior: Process runs and populates the knowledge graph with current data
Option 10: Set Error Logging
10. Set logging level to ERROR
Purpose: Reduces logging to only error messages
Effect: Sets LOG_LEVEL=ERROR in satori.ini
When to use:
- After debugging is complete
- To reduce log file sizes
- For normal production operation
Option 11: Create Debug Snapshot
11. Create Satori debug snapshot for Verity Support
Purpose: Generates comprehensive diagnostic package for technical support
Prerequisites:
- Must attempt steps 1-9 first
- Specify which step failed
What it includes:
- Complete system status check
- All container logs (latest 10,000 lines)
- Error logs for each container
- System resource information
- Docker container status and stats
- Configuration settings
- Time synchronization status
- API connectivity results
Output Files:
- satori_debug_package_YYYYMMDD_HHMMSS.tar: Complete debug package
Archive Contents:
satori_service_status_results_YYYYMMDD_HHMMSS.txt
<container_name>_latest.log (for each container)
<container_name>_errors.log (for each container)
system_info.txt
When to use: When you need to provide comprehensive system information to Verity Support
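If a single container needs a closer look before building the full snapshot, similar per-container files can be produced by hand. This is only a sketch: `errors_only` is a hypothetical helper, the keyword filter is an assumption (the script's actual error filter is not documented here), and the container name is a placeholder.

```shell
# Hypothetical helper: keep only lines that look like errors (case-insensitive).
# "|| true" keeps the exit status at zero when nothing matches.
errors_only() {
    grep -iE 'error|exception|fatal' || true
}

# Example usage (placeholder container name; requires Docker):
#   sudo docker logs --tail 10000 grafana > grafana_latest.log 2>&1
#   errors_only < grafana_latest.log > grafana_errors.log
```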
Option 12: Reinstall Latest Application Package
12. Reinstall of latest application package
Purpose:
To force a reinstallation of the latest version of the Satori application package.
What it does:
- Prompts for confirmation, as the process can take 5-30 minutes.
- Forces a reinstallation of the latest available Satori application package.
When to use:
- If the current application is corrupted.
- To ensure the latest package is installed.
- When directed by Verity Support.
Option 13: Restore Prometheus Volume from Backup
13. Restore Prometheus Volume from backup
Purpose: Restores the Prometheus Docker volume from a backup.
What it does:
- Prompts the user to select one of the existing Prometheus backups
- Restores the Prometheus Docker volume from the selected backup
When to use:
- After data corruption in Prometheus
- To revert Prometheus data to a previous state
- For disaster recovery
Option 14: Exit
14. Exit
Purpose: Cleanly exits the troubleshooting script
Option 999: Restore Previous Version ⚠️
999. Restore the saved previous version of the Satori package (use with caution)
Warning
Use only with support from Verity Support
Purpose: Restores a previous version of the Satori software
Prerequisites: Previous version archive must exist in /be_install/archive/
Process:
- Lists available archived versions
- Requests version number to restore
- Confirms restoration action
- Extracts selected version to /be_satori/
When to use:
- After failed upgrade
- To revert problematic changes
- Only under technical support guidance
⚠️ Caution: This action cannot be undone
Usage Workflows
Daily Health Check
- Run Option 1 (Check services)
- Run Option 2 (Check time sync)
- Run Option 7 (Check API)
Troubleshooting Workflow
- Set debug logging (Option 8)
- Check all services (Option 1)
- Verify connectivity (Options 6, 7)
- Perform ETL test (Option 9)
- Create debug snapshot (Option 11)
- Reset logging level (Option 10)
Pre-Support Contact
- Complete steps 1-9 in order
- Note which step fails
- Create debug snapshot (Option 11)
- Provide the generated tar file to support
Expected Output Examples
Healthy System Output
Select:
1. Check if the Satori services are running
...
Enter your choice: 1
ETL services are running.
Grafana services are running.
Prometheus services are running.
Promtail services are running.
Telegraf services are running.
Neo4j-graph services are running.
Loki services are running.
Cadvisor services are running.
Alertmanager services are running.
Node-exporter services are running.
Sensai-service services are running.
Neo4j-vector services are running.
--------------------------------------------
Total Services Tested: 12
Currently Running: 12
Not Running: 0
--------------------------------------------
Press Enter to continue...
Problem System Output
ETL services are running.
Grafana services is not running.
Prometheus services are running.
...
--------------------------------------------
Total Services Tested: 12
Currently Running: 11
Not Running: 1
--------------------------------------------
run the command 'sudo docker compose -f /be_satori/docker-compose.yml up -d' and re-run this script
Troubleshooting Common Issues
No Docker Containers Running
Symptoms: All services show "not running"
Solution:
sudo docker compose -f /be_satori/docker-compose.yml up -d
API Calls Failing (HTTP 401)
Symptoms: Option 7 shows HTTP status code 401
Cause: Invalid API credentials
Solution: Check the /be_satori/.env file for correct credentials
API Calls Failing (HTTP 000)
Symptoms: Option 7 shows HTTP status code 000
Cause: Network connectivity issue
Solution:
1. Check Option 3 for correct FQDN
2. Verify network connectivity: ping <fqdn>
3. Check DNS resolution: nslookup <fqdn>
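The three steps above can be chained by feeding the host name reported by Option 3 into the network checks. The sketch below assumes the report line has the `FQDN setting: <host>` form shown under Option 3. `fqdn_from_report` is a hypothetical helper, not part of the script.

```shell
# Hypothetical helper: pull the host name out of the "FQDN setting: <host>"
# line that Option 3 prints.
fqdn_from_report() {
    sed -n 's/^FQDN setting:[[:space:]]*//p'
}

# Example usage:
#   fqdn=$(echo "FQDN setting: vnc-satori.company.com" | fqdn_from_report)
#   nslookup "$fqdn"     # DNS resolution
#   ping -c 3 "$fqdn"    # basic reachability
```

If `nslookup` fails, the problem is DNS. If it succeeds but `ping` does not, look at routing or firewalls, keeping in mind that some networks block ICMP.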
Time Synchronization Issues
Symptoms: Option 2 shows System clock synchronized: no
Solution:
sudo systemctl restart systemd-timesyncd
sudo timedatectl set-ntp true
Debug Snapshot Creation Fails
Symptoms: Option 11 fails to create archive
Cause: Insufficient disk space or permissions
Solution:
# Check disk space
df -h
# Ensure proper permissions
sudo chown -R $USER:$USER ./
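The disk space check above can be turned into a pass/fail test before Option 11 is retried. This is a minimal sketch: `has_free_space` is a hypothetical helper, and the 1 GiB threshold in the usage comment is an arbitrary placeholder, since the size of a debug package is not stated here.

```shell
# Hypothetical helper: succeed if <dir> has at least <min_kb> KiB available.
# Usage: has_free_space <dir> <min_kb>
has_free_space() (
    dir=$1
    min_kb=$2
    # -P forces the portable one-line-per-filesystem layout, -k reports KiB;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$min_kb" ]
)

# Example usage (placeholder threshold of 1 GiB):
#   has_free_space . 1048576 || echo "less than 1 GiB free in the current directory"
```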
Log File Locations
Script Outputs
- Debug snapshots: ./satori_debug_package_YYYYMMDD_HHMMSS.tar
- Container logs: ./satori_logs_snapshot/
System Logs
- Docker containers: sudo docker logs <container_name>
- System logs: /var/log/syslog
- Monitoring logs: /be_satori/logs/ (if it exists)
Best Practices
Regular Maintenance
- Run daily health checks (Options 1, 2, 7)
- Monitor log levels (keep at ERROR for production)
- Perform periodic ETL operations (Option 9)
Before Support Contact
- Always run the complete diagnostic sequence (1-9)
- Create debug snapshot (Option 11)
- Note exact error messages and timing
- Include system information and recent changes
Safety Guidelines
- Never use Option 999 without support guidance
- Test changes in non-production first
- Always create debug snapshots before major changes
- Keep debug logging temporary (return to ERROR level)
Recovery Procedures
Complete System Recovery
If multiple services are down:
# Stop all containers
sudo docker compose -f /be_satori/docker-compose.yml down
# Start all containers
sudo docker compose -f /be_satori/docker-compose.yml up -d
# Wait 2 minutes, then check status
sleep 120
# Run troubleshooting script Option 1
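As an alternative to a fixed two-minute wait, a polling loop can return as soon as a check passes. The sketch below is illustrative only: `wait_for` is a hypothetical helper, and the container name in the usage comment is a placeholder.

```shell
# Hypothetical helper: retry a command every <interval> seconds until it
# succeeds or roughly <timeout> seconds have elapsed.
# Usage: wait_for <timeout> <interval> <command> [args...]
wait_for() (
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@"; do
        [ "$elapsed" -ge "$timeout" ] && exit 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
)

# Example usage: wait up to 120 s for a container named "grafana" (placeholder)
#   wait_for 120 5 sh -c 'sudo docker ps --format "{{.Names}}" | grep -qx grafana'
```

A running container is not necessarily a healthy one, so Option 1 of the troubleshooting script is still worth running afterwards.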
Data Recovery
If data appears corrupted:
- Use Option 13 to restore the Prometheus volume from backup.
- Use Option 999 to restore previous version (with support)
- Re-run setup script if necessary
Support Information
When to Contact Support
- Multiple services consistently failing
- API connectivity issues persist after basic troubleshooting
- Data corruption or loss
- Before using Option 999
Information to Provide
- Debug snapshot tar file from Option 11
- Specific error messages
- Recent system changes
- Output from Options 1, 2, 3, 4, 5, 7
Self-Help Resources
- Check Docker container logs: sudo docker logs <container_name>
- Verify system resources: free -h and df -h
- Check network connectivity to vNetC host
- Review satori.ini configuration file
This troubleshooting script is designed to be comprehensive and self-contained, providing both diagnostic capabilities and maintenance functions for Satori.