Alerts/Metrics List

Performance Module

The following lists the metrics in each tab of the AimBetter Platform Performance Module.

-Windows

— Hosts

Metric | Description | Investigate this alert
CPU Usage | The overall percentage of time the CPU spends executing non-idle tasks. High CPU utilization may indicate that the CPU is under heavy load and could be a bottleneck in system performance. | Read more
CPU Queue Length | The number of processes waiting to be executed by the CPU at a given time. It can provide insights into system performance and potential bottlenecks.
CPU Hardware Interrupts | Signals that occur when a hardware device requires servicing or when a specific event or condition, such as a timer or a peripheral input, needs to be handled by the CPU. | Read more
CPU DPC Interrupts | Also known as Deferred Procedure Call interrupts. In Windows operating systems, device drivers generate them to request deferred execution of work that cannot be completed immediately.
Total memory | The total amount of RAM (Random Access Memory) in GB. RAM is volatile memory that provides temporary storage for data and instructions that the CPU needs to access quickly. | Read more
Memory free | The amount of system memory (RAM) in GB that is not currently used by any active process or application. It represents the portion of memory readily available for immediate allocation by the operating system or newly launched programs. | Read more
Memory free % (percentage) | The system’s free memory as a percentage of the total memory. | Read more
Disk Usage | The amount of the total disk capacity, in GB, currently used by files and system components. | Read more
Disk Busy | The percentage of time the disk is actively performing read or write operations, compared to the total available time. | Read more
Last Restart | The time of the last system restart.
Ping Lost Packets (0-12) | The number of unsuccessful communication integrity checks out of 12 attempts. By default, the ping is sent to google.com; this can be changed in the AimBetter configuration. | Read more
Network Jitter | The variation, in milliseconds, in packet delivery delay across all 12 communication integrity checks. It measures the inconsistency in the timing of data packets as they travel from the source to the destination. | Read more
Network Latency | The time in milliseconds for a packet to travel from its source to its destination, including transit time and any processing delays along the way. | Read more
Internet Latency | The delay in milliseconds experienced when data packets travel between a user’s device and a remote server or destination over the internet. | Read more
Internet Jitter | The variation in latency experienced by data packets as they travel over the internet across all 12 communication integrity checks. | Read more
Uptime | The length of time that a server has been running continuously without a restart or shutdown.
OS | The operating system version name.
SP | The operating system’s update level (service pack).
CPU Cores | The number of individual processing units within the central processing unit (CPU). Each core can execute instructions and perform calculations independently of the other cores. | Read more
Memory Page Read | The time for retrieving data from a page of memory into the processor’s cache. Memory page reads are essential for efficient memory access and are typically performed automatically by the hardware and operating system. To minimize the number of page reads, frequently accessed data should be kept in the processor’s cache.
Paging Used | The amount of Pagefile usage. The Pagefile, also referred to as virtual memory, resides on disk storage and supplements the system’s physical memory (RAM) when processes need additional memory. | Read more
Total Disk IO | The volume of read (input) and write (output) operations on the system’s disk storage, measured in I/O operations per second (IOPS).
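
As an illustration of how the percentage and ping metrics above relate to raw samples, here is a minimal Python sketch. It is not AimBetter's implementation: the function names are hypothetical, and the jitter formula (mean absolute difference between consecutive replies) is an assumption, since the table only describes jitter as the variation in delay across the 12 checks.

```python
from statistics import mean
from typing import Optional, Sequence


def memory_free_percent(total_gb: float, free_gb: float) -> float:
    """Free memory as a percentage of total memory."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return 100.0 * free_gb / total_gb


def ping_summary(samples_ms: Sequence[Optional[float]]) -> dict:
    """Summarize one round of checks; None marks a lost packet."""
    replies = [s for s in samples_ms if s is not None]
    lost = len(samples_ms) - len(replies)
    latency = mean(replies) if replies else None
    # Assumed jitter definition: mean absolute difference between
    # consecutive successful replies.
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return {"lost": lost, "latency_ms": latency, "jitter_ms": jitter}
```

Passing 12 samples with one `None` entry would report one lost packet and compute latency and jitter from the 11 replies that arrived.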

— Network

Metric | Description | Investigate this alert
Card Name | The name of the sampled network card.
Bandwidth | The amount of data that can be transmitted over the network, in gigabits per second (Gbps). Bandwidth determines the speed at which data can be transmitted between devices; higher bandwidth allows faster and larger data transfers. A card can be configured sub-optimally: for example, the card supports 1 Gbps but is set to 100 Mbps.
Network utilization | The percentage of available network bandwidth being used by data traffic at a given time. A high percentage indicates heavy data transfer, which slows data transfer between programs and systems throughout the network. | Read more
Receive Kbyte (sec) | The amount of data received through the network card in KB per second. High values indicate that the server is receiving large amounts of data, which can cause system slowness.
Send Kbyte (sec) | The amount of data sent through the network card in KB per second. High values indicate that the server is sending large amounts of data, which can cause system slowness.
Model | The name and model number of the network interface card (NIC).
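
To make the relationship between the send/receive rates and the utilization percentage concrete, here is a small Python sketch. It rests on assumptions that the table does not state: that utilization is the sum of sent and received traffic divided by the configured bandwidth, and that 1 KB is 1000 bytes. The function name is hypothetical.

```python
def network_utilization_percent(
    receive_kb_s: float, send_kb_s: float, bandwidth_gbps: float
) -> float:
    """Share of the card's configured bandwidth consumed by traffic."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth_gbps must be positive")
    # Assumption: 1 KB = 1000 bytes, 8 bits per byte.
    bits_per_sec = (receive_kb_s + send_kb_s) * 1000 * 8
    capacity_bits_per_sec = bandwidth_gbps * 1_000_000_000
    return 100.0 * bits_per_sec / capacity_bits_per_sec
```

Under these assumptions, 5000 KB/s received plus 1250 KB/s sent is about 5% of a 1 Gbps card, but about 50% of the same card when it is set to 100 Mbps (0.1 Gbps), which is why a sub-optimal card configuration matters.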

— Disk

Metric | Description | Investigate this alert
Drive | The drive name.
Total | The total disk storage capacity in GB.
Disk Usage (GB) | The disk storage usage in GB. Usage above 95% of the storage space can lead to data loss and compromise the integrity of programs and processes in the system. | Read more
Free Space | The free disk storage space in GB. Low free space can lead to data loss and compromise the integrity of programs and processes in the system. | Read more
Busy Time | The percentage of time the disk is actively handling I/O operations (read/write), as opposed to being idle. A high value can cause system slowness. | Read more
Write /R (ms) | The time taken to write to the disk, in milliseconds. A write time above one millisecond indicates a load on the disk or a disk integrity problem. | Read more
Read /R (ms) | The time taken to read from the disk, in milliseconds. A read time above one millisecond indicates a load on the disk or a disk integrity problem. | Read more
IO (sec) | The amount of data read from and written to the disk per second. If the amount of reading and writing is high, the system will respond slowly. | Read more
IO Write(sec) | The amount of writing to the disk per second. If the amount of writing is high, the system will respond slowly. | Read more
IO Read(sec) | The amount of reading from the disk per second. If the amount of reading is high, the system may respond slowly. | Read more
Disk Free (%) | The percentage of free disk storage space. A low percentage indicates that disk storage has almost reached its capacity. It’s recommended to look for the processes or files that consume most of the storage. | Read more
Est. Max IO | The estimated maximum I/O operation rate that the disk can reach. This parameter helps determine whether the system can handle the expected workload and identify potential performance limitations or the need for additional disk resources. | Read more
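
The usage and free-space figures above are simple arithmetic on the Total and Free Space values. A minimal Python sketch follows; the function name and the returned keys are hypothetical, and the 95% default is taken from the Disk Usage description.

```python
def disk_status(total_gb: float, free_gb: float, usage_limit_pct: float = 95.0) -> dict:
    """Derive usage figures for one drive from its total and free space."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    used_gb = total_gb - free_gb
    used_pct = 100.0 * used_gb / total_gb
    return {
        "used_gb": used_gb,
        "free_pct": 100.0 * free_gb / total_gb,
        # True when usage exceeds the limit described in the table.
        "over_limit": used_pct > usage_limit_pct,
    }
```

For a 500 GB drive with 20 GB free, this reports 480 GB used, 4% free, and flags the drive as over the 95% limit.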

— CPU

Metric | Description | Investigate this alert
Core Usage | The percentage of time an individual core spends executing non-idle tasks. | Read more
Core No. | The core number.
Core Hardware interrupts | The signals a specific core receives from a device external to the OS. A high value may indicate processes that are causing slowness in the OS. | Read more
Core DPC interrupts | The DPC signals a specific core receives. DPCs (deferred procedure calls) are interrupts that run at a lower priority than standard interrupts. A high value indicates a possible processor bottleneck or an application or hardware-related issue that can significantly diminish overall system performance.

— Paging

Metric | Description | Investigate this alert
Pagefile | The path of the Pagefile that the operating system uses as an extension of physical memory. When physical memory (RAM) becomes scarce, the operating system can move infrequently accessed or idle pages to the Pagefile to free up memory for other processes or data.
Used | The amount of Pagefile usage on the disk. Consider adding physical memory if Pagefile usage is consistently high. | Read more
Max | If the Pagefile limit has been set manually, this metric indicates the maximum storage space assigned to the Pagefile.
Init | If the Pagefile limit has been set manually, this metric indicates the initial storage space assigned to the Pagefile.
Manage type | How the Pagefile (virtual memory) limit has been set: manually or automatically (the latter is less recommended).
Allocated | The physical size assigned to the Pagefile (virtual memory) on disk storage.

— Services

Metric | Description | Investigate this alert
Host | The host name as defined in the AimBetter configuration.
Name | The name of the service in the registry (the service’s key).
Display Name | A user-friendly name for the service, as shown in the Services control panel application.
State | The service status: Running, Stopped, or Paused.
Mode | The service operation mode: Manual, Automatic, or Disabled. We recommend setting critical services to Automatic mode. The Manual setting is for cases where we want to control when the service starts.
Account | The authorization level with which the service runs. This is important when we want to grant limited permissions to specific users of the service.
Path | The location of the service executable file.
Running | The service status: 0 is Down and 1 is Up. The graph shows whether the service is Up or Down; it is sampled every minute and illustrates uptime continuity. | Read more
Host | The host name as defined in the AimBetter configuration.
Status | The service’s current state or condition.
Description | The location of the service description. This description includes information and configuration settings of the system service.
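
The Running metric is a per-minute series of 0 (Down) and 1 (Up) samples. As a rough sketch of how uptime continuity could be read from such a series, the hypothetical Python helper below computes the uptime percentage and the longest run of consecutive Down samples. It is an illustration under the assumption of exactly one sample per minute, not the platform's own calculation.

```python
from typing import Sequence


def uptime_summary(samples: Sequence[int]) -> dict:
    """Summarize per-minute Running samples (1 = Up, 0 = Down)."""
    if not samples:
        raise ValueError("samples must not be empty")
    up_count = sum(1 for s in samples if s == 1)
    longest = current = 0
    for s in samples:
        # Extend the current outage on a Down sample, reset on Up.
        current = current + 1 if s == 0 else 0
        longest = max(longest, current)
    return {
        "uptime_pct": 100.0 * up_count / len(samples),
        "longest_outage_min": longest,
    }
```

For the ten-minute series `[1, 1, 0, 0, 0, 1, 0, 1, 1, 1]`, the service was up for 60% of the samples and the longest outage lasted three minutes.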

— Process

Metric | Description | Investigate this alert
Host | The host name as defined in the AimBetter configuration.
User Name | The name of the user running the process. When a process needs to be shut down due to high system resource consumption, it is important to know who is running it.
Process Name | The name of the running process.
CPU | The processor usage (percentage) of the process. High values can lead to system slowness. | Read more
Memory | The amount of physical memory used by the process, in MB. High values may slow down system programs and processes. | Read more
Page Files | The amount of Pagefile (virtual memory) used by the process, in MB. A high value can indicate a problem with the physical memory. | Read more
Virtual Memory | The total amount of physical memory and Pagefile (virtual memory) used by the process.
Reads | The number of reads the process has performed from physical memory since it started or since the counter was last reset.
Writes | The number of writes the process has performed to physical memory since it started or since the counter was last reset.
Process ID | A number identifying the process in the system.
Command Line | The command that launched the process’s executable file, including parameters.
Last initialization | The time when the process was started.
Path | The path of the executable file.
Page Fault/sec | The rate at which page faults occur. It is used to assess the efficiency of memory management in the system.
Uptime | The length of time that the process has been running continuously without a restart or pause.
Private Memory | Memory exclusively allocated and accessible to a specific process or application. It stores information unique to that process, such as its variables, stack, and heap.
Shared Memory | Memory that can be accessed by multiple processes simultaneously. It is used for inter-process communication (IPC), where one process writes data into the shared memory region and other processes read it from the same region.
Physical Mem. | The actual hardware memory used by the process.
Virtual Mem. | The process’s use of memory beyond physical memory, such as disk space or swap space used as an extension of RAM. The general recommendation is for this value to be as low as possible.

— Images

Metric | Description | Investigate this alert
Host | The name of the server where the image is located, according to the name given in the UI during installation.
Image | The image name. An image is a compiled binary file that contains the machine code of a program or software application, produced by compiling its source code (the program’s executable).
CPU | CPU consumption (%). Values above the average indicate an increase in system resource consumption, which may lead to system latency. | Read more
Memory | The amount of physical memory (RAM) used by the program, in MB or GB. High values may slow down system programs and processes. | Read more
Process Count | The number of processes currently running on the server that belong to the selected image. A high number of processes can lead to high resource consumption.
Path | The image path.
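
The Images tab groups the processes of the Process tab by the executable they run. The Python sketch below shows one way such a grouping could work. The record fields (`image`, `cpu_pct`, `memory_mb`) are hypothetical, and summing CPU and memory per image is an assumption rather than a documented AimBetter rule.

```python
from collections import defaultdict
from typing import Iterable, Mapping


def summarize_images(processes: Iterable[Mapping]) -> dict:
    """Group process records by image name and total their resource use."""
    totals = defaultdict(
        lambda: {"process_count": 0, "cpu_pct": 0.0, "memory_mb": 0.0}
    )
    for proc in processes:
        entry = totals[proc["image"]]
        entry["process_count"] += 1
        entry["cpu_pct"] += proc["cpu_pct"]
        entry["memory_mb"] += proc["memory_mb"]
    return dict(totals)
```

Two processes running the same executable would appear as a single image with a Process Count of 2 and their combined CPU and memory figures.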

— MSSQL on Windows

Metric | Description | Investigate this alert
Version | The SQL Server version installed on the server.
Instance | The SQL Server instance name given during installation.
Test connection | The time to establish a connection to the SQL Server, in milliseconds. A high value indicates network communication problems or a load on the SQL Server. | Read more
Last Restart | The time of the last SQL Server restart.
Collation | In SQL Server, a collation is a set of rules that determine how data is sorted and compared in string-based operations. Collations allow database administrators to define the appropriate rules for sorting and comparing strings based on the language and cultural context of the stored data.
Edition | The installed SQL Server edition, for example Express, Developer, or Enterprise. There are numerous editions, and each edition has two runtimes: 32-bit or 64-bit.
SP | The Service Pack, which includes all the fixes and improvements from previous service packs and cumulative updates for a specific version of SQL Server.
Page life expectancy | The time, in seconds, that SQL Server keeps retrieved data pages in the server’s physical memory. Low values indicate that SQL Server is replacing the pages held in physical memory at a high frequency and needs more physical memory to perform faster. | Read more
User Connections | A connection established between a client application and the database server using SQL credentials counts as a single user connection. A large number may indicate a load on the system, a fault, or a security issue. | Read more
Connection reuse/sec | The number of logins started from the connection pool per second. Applications tend to open and close connections repeatedly; this value indicates how often connections are reused.
Batch requests/sec | The number of update, retrieval, deletion, or save operations in SQL Server per second. This metric lets the user detect abnormalities in the number of operations on the SQL Server. | Read more
Buffer cache hit ratio | The percentage of memory requests that are satisfied from the cache (the physical memory of the SQL Server). Values below 90% indicate many reads/writes from/to disk storage. You should investigate whether different programs or processes are consuming a large amount of physical memory and consider adding physical memory to the SQL Server. | Read more
Page reads/sec | The number of page reads (each page is 8 KB) from the disk per second. Many reads indicate that we should examine the SQL Server’s integrity, indexing, and query logic.
Page writes/sec | The number of page writes (each page is 8 KB) to the disk per second. Many writes indicate that we should examine the SQL Server’s integrity, indexing, and query logic.
SP Compilation | The number of times per second that SQL Server compiles execution plans for queries. A large number of compilations together with a small number of Batch requests indicates heavy use of direct queries and sp_executesql rather than procedures with defined variables.
SP Re Compilation | The number of times per second that SQL Server recompiles execution plans for queries. A large number of recompilations combined with a small number of Batch requests indicates that the amount of retrieved data has grown, statistics have been updated, or indexes have been rebuilt. We should investigate the amount of data and whether these other operations have been performed.
Page Lookups | The number of times SQL Server looks up pages (each page is 8 KB) in physical memory. A ratio of (Page lookups/sec) / (Batch requests/sec) greater than 100 indicates that some queries are not running optimally.
Latches Times | The duration in seconds for which a thread holds exclusive access (a “latch”) to a shared resource (for example, a “latched table”). A large number of latches slows data retrieval from the latched tables. We should investigate changing the update or deletion method.
Page Splits/sec | The number of page splits per second, which occur for allocation purposes when an index page has no free space. A rate higher than 20 per second calls for a review of the index specifications.
Checkpoint Pages/sec | The number of data pages written to disk per second during a checkpoint operation. A checkpoint is a process in which SQL Server ensures that all modified data pages in memory are flushed and written to disk to maintain data consistency and durability.
DB IO/sec | The number of reads and writes across the entire database per second.
Target Memory | The target amount of RAM that SQL Server is allowed to consume for its internal operations.
Memory | The amount of memory SQL Server is using, in MB. If SQL Server is not using the maximum memory amount specified, we should consider lowering that amount.
Memory Details | A breakdown (pie chart) of SQL Server’s physical memory usage into database memory, internal needs, and free memory, in MB.
DB Memory | The memory used by the SQL Server instance to cache data and other objects related to specific databases.
Free Memory | The amount of physical memory not used by SQL Server, in MB. A high value may indicate that the memory assigned to SQL Server can be reduced.
Internal Memory | The amount of physical memory that SQL Server uses for internal operations, excluding operations for the database, in MB. For example: buffer pool, execution plans, system tables, procedure cache, and management. | Read more
Memory (min) | The minimum amount of physical memory, in MB, assigned to SQL Server.
Memory(max) | The maximum amount of physical memory, in MB, assigned to SQL Server.
Temp table creation/sec | The number of temp table creations per second.
Uptime | How long the SQL Server instance has been up and running. It’s recommended to keep uptime as high as possible on high-traffic instances.
Cluster active name | The name of the active cluster in clustered instances.
Cluster nodes down | The number of cluster nodes that are down. The server may be one of the cluster nodes.
Transactions/sec | The number of open transactions per second. Values higher than usual can cause system slowness. | Read more
Lazy writes/sec | Measures the flushing of modified data pages from memory (buffer cache) to disk (data files) per second. By deferring immediate disk writes and batching them together, lazy writes help improve system performance: they reduce the frequency of disk I/O operations, minimize disk access latency, and allow the system to perform multiple writes more efficiently. A high value indicates that SQL Server needs more memory, and it can affect other OS resources such as disk I/O and CPU usage.
Index Full scan/sec | The number of indexes scanned per second. An index full scan is an alternative to a full table scan when the index contains all the columns needed for the query and at least one column in the index key has a “NOT NULL” constraint.
Index page splits/sec | The number of index page splits per second, affected by fragmentation. A page split occurs when a page has no free space for a value being inserted or updated in the table; the page is split to free space so the command can complete.
Logins/sec | The number of logins to the SQL Server per second. A high value may point to security or application problems.
Logouts/sec | The number of logouts from the SQL Server per second. A high value may point to connection problems.
Core available | The total number of cores in the server.
Core in use | The number of cores assigned for SQL Server use. The recommendation is for SQL Server to use all cores.
Session Memory wait | The number of sessions waiting for free memory. Their queries don’t have enough RAM to start running, so they are “delayed.”
Create temp table/ variables | The number of temporary tables/variables created. A high value can indicate unnecessary open connections.
TempDB free space | The unused data space in the tempdb database, in KB. A high value may indicate unusual data growth.
Session avg. wait for signal | The average wait time, in milliseconds, that SQL Server reports for sessions in a wait state. The threshold is based on past activity and behavior. When the value is higher than average, it can slow SQL Server performance. | Read more
Session CPU wait | The number of queries that SQL Server reports as waiting for CPU availability. The threshold is based on past activity and behavior. When the value is higher than average, it’s recommended to investigate and look for those queries. | Read more
Currently Active | The number of queries currently running (status is running; for Oracle, status is active).
Currently Blocked | The number of queries currently blocked (status is suspended). | Read more
Currently Sleeping | The number of queries currently sleeping. A sleeping query has been executed and its results returned to the client application, but the connection to the database is still open and waiting for further instructions or actions from the client.
Currently Background | The number of queries running in the background. These run separately from the main execution flow of a program or application in order to prevent blocking.
Currently Open Transactions | The number of queries that have open transactions at the moment. | Read more
Currently Killed | The number of queries killed in the last minute. | Read more
Currently Avg Duration/sec | The average time, in seconds, taken to execute a single query or command.
Number of Queries 0-9.99 | A count of queries running for less than 10 seconds. | Read more
Number of Queries 10-19.99 | A count of queries running for 10 to 20 seconds. | Read more
Number of Queries 20-29.99 | A count of queries running for 20 to 30 seconds. | Read more
Number of Queries 30-59.99 | A count of queries running for 30 to 60 seconds. | Read more
Number of Queries over 60 | A count of queries running for over 60 seconds. | Read more
Subscriber High latency | The time it takes for changes made at the publisher to be replicated to the subscriber. | Read more
Distributor High latency | The time it takes for transactional changes generated at the publisher to be delivered to the distributor for further replication processing. | Read more
LogReader High latency | The time it takes for the LogReader agent to read the transaction log from the publisher and deliver the changes to the distributor for replication. | Read more
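
Several descriptions in this table state numeric rules of thumb: a buffer cache hit ratio below 90%, more than 100 page lookups per batch request, and more than 20 page splits per second. The Python sketch below collects those three checks in one place. The function and parameter names are hypothetical; the thresholds are the ones quoted in the table, not values read from any AimBetter configuration.

```python
def sql_health_flags(
    buffer_cache_hit_pct: float,
    page_lookups_per_sec: float,
    batch_requests_per_sec: float,
    page_splits_per_sec: float,
) -> list:
    """Return a list of warnings for the rules of thumb in the table."""
    flags = []
    if buffer_cache_hit_pct < 90:
        flags.append("buffer cache hit ratio below 90%")
    # Guard against division by zero when the server is idle.
    if (
        batch_requests_per_sec > 0
        and page_lookups_per_sec / batch_requests_per_sec > 100
    ):
        flags.append("page lookups per batch request above 100")
    if page_splits_per_sec > 20:
        flags.append("page splits above 20 per second")
    return flags
```

For example, 60,000 page lookups/sec against 500 batch requests/sec gives a ratio of 120, which would be flagged as a sign that some queries are not running optimally.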

— Oracle on Windows

Metric | Description | Investigate this alert
Database | The database name.
Edition | The database edition.
32/64 | The database runtime: 32-bit or 64-bit.
Version | The database version.
Log Mode | The management mode of the database redo logs, which are used for data integrity or recovery in case of disaster. There are several modes:
● ARCHIVELOG mode: allows creating backups that capture changes made to the database since the last backup.
● NOARCHIVELOG mode: limits the ability to perform point-in-time recovery, since archived redo logs are not available.
● FORCE LOGGING mode: all data changes made to the database are logged to the redo log files, even for operations that would not typically generate redo logs.
National Language (NLS) | A set of features and settings that allow Oracle Database to handle multiple languages.
Patch Level | The version number of the Oracle software, including cumulative updates and releases.
Last Restart | The last restart, in the format date:hour:minute.
Test Connection | The time to establish a connection to the database, in milliseconds. A high value indicates network communication problems or a load on the Oracle Database. | Read more
Session Limit | The number of user sessions currently connected to the database, as a percentage of the maximum sessions allowed.
Session (Max) | The maximum number of concurrent user sessions allowed to connect to the database. Each user session represents a connection with the database.
Processes (Max) | The maximum number of concurrent user processes allowed to connect to the database.
Default block size | The standard size of a data block used for storing data and managing database objects within the database’s data files. As of Oracle Database 12c, the default block size is typically 8192 bytes (8 KB) for a general-purpose database.
Open Cursors (max) | The maximum number of open cursors in the database. A cursor is a programmatic handle or pointer used by the database to access or process the results of queries or DML statements. Developers should explicitly close cursors once they are no longer needed.
DR Last Sync Date | The last synchronization date and time for a Data Guard configuration (a high-availability and disaster recovery solution that keeps standby databases synchronized with the primary database). | Read more
Physical Reads | The number of reads across the entire database, measured in blocks as defined in “Default block size” (default is 8 KB).
Physical Writes | The number of writes across the entire database, measured in blocks as defined in “Default block size” (default is 8 KB).
DR Full Backup | The last date and time of a full backup taken from the primary database and used to initialize or restore a standby database in a Data Guard configuration. | Read more
Archive Log Backup | The last date and time of a backup operation that specifically targets the archived redo logs. | Read more
CTL SP File Backup | The last date and time of a backup of the control file and server parameter file (SPFILE). It helps ensure the recoverability of the database in case of disasters, media failures, or user errors. | Read more
Avg. Threads | The average number of concurrently executing tasks or processes, such as operating system threads, Java threads, database sessions, or parallel query executions. Calculated as the number of active sessions / CPU cores (as a percentage).
Database CPU Time | The percentage of CPU time remaining for the execution of SQL statements by the Oracle database processes and other database-related operations. Higher values mean fewer waits for CPU, which improves performance. Best practices include minimizing scans and waits in query execution.
Buffer Cache Hit | The percentage of requested data blocks found in the database buffer cache, avoiding the need to read the block from disk. | Read more
PGA Cache Hit | The percentage of times process data requests are found in the Program Global Area (PGA) cache allocation, without a need for additional memory or a read from disk. The higher this value, the more efficient the database is. | Read more
Deadlocks | The number of deadlocks in the database server. | Read more
Invalid Object | The number of database objects that are currently in an invalid state.
Redo Entries (rows update) | The number of records that capture changes made to the database, related to the redo log. A value higher than usual may indicate a possible cause of slow performance.
Query Execute | The number of queries being executed at the moment.
Avg. Sessions | The average number of sessions at the moment.
Avg. Active | The average number of active sessions at the moment.
Avg. Blocking | The average number of blocking sessions at the moment. | Read more
Avg. Sleeping Blocking | The average number of sessions that are both sleeping and blocking at the moment.
Avg. Blocked | The average number of blocked sessions at the moment. | Read more
Avg. Open Transaction | The average number of open transactions at the moment. | Read more
Avg. Sleeping | The average number of sleeping sessions at the moment. A sleeping query has been executed and its result returned to the client application, but the connection to the database is still open and waiting for further instructions or actions from the client.
Avg. Background | The average number of background sessions at the moment. These run separately from the main execution flow of a program or application in order to prevent blocking.
Avg. Duration | The average duration, in seconds, of all queries running at the moment in different sessions.
Number of Queries 0-9.99 | A count of queries running for less than 10 seconds. | Read more
Number of Queries 10-19.99 | A count of queries running for 10 to 20 seconds. | Read more
Number of Queries 20-29.99 | A count of queries running for 20 to 30 seconds. | Read more
Number of Queries 30-59.99 | A count of queries running for 30 to 60 seconds. | Read more
Number of Queries over 60 | A count of queries running for over 60 seconds. | Read more
Archive Logs Retention | The retention period for archived redo log files.
Log Switch | The count of log file switch completions. A log switch occurs when the database stops writing redo log entries to one redo log group (also known as a redo log file) and begins writing to another.
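
The "Number of Queries" rows split running queries into duration ranges, and Avg. Threads is described as active sessions divided by CPU cores. The following Python sketch illustrates both calculations. The names are hypothetical, and placing a query of exactly 60 seconds in the "over 60" bucket is an assumption, since the table does not say where that boundary falls.

```python
# Lower bound (inclusive), upper bound (exclusive), label as in the table.
BUCKETS = [
    (0, 10, "0-9.99"),
    (10, 20, "10-19.99"),
    (20, 30, "20-29.99"),
    (30, 60, "30-59.99"),
]


def bucket_query_durations(durations_sec):
    """Count running queries per duration range."""
    counts = {label: 0 for _, _, label in BUCKETS}
    counts["over 60"] = 0
    for duration in durations_sec:
        if duration < 0:
            raise ValueError("durations must be non-negative")
        for low, high, label in BUCKETS:
            if low <= duration < high:
                counts[label] += 1
                break
        else:
            # No range matched, so the query has run 60 seconds or more.
            counts["over 60"] += 1
    return counts


def avg_threads_percent(active_sessions: float, cpu_cores: int) -> float:
    """Active sessions relative to CPU cores, as a percentage."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return 100.0 * active_sessions / cpu_cores
```

With 6 active sessions on an 8-core server, `avg_threads_percent` yields 75%.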

— DB on Windows MSSQL

Metric | Description | Investigate this alert
Status | Database status:
● Online: the database is available.
● Offline: the database is not in use.
● Mirror Disconnect: the sync is disconnected.
● Mirror Principal: the database is the principal in the sync, the source of all database updates.
● Mirror: the database is synchronized.
● Restoring: the database is currently being restored.
● Suspect: the database is defective.
Read more
Instance | The SQL Server instance name given during installation.
Database | The database name.
Recovery | The recovery model determines the restore options available for the database. It defines how the database transaction logs are managed and what data can be recovered in case of a failure.
Full Backup | The date of the last full backup performed on the database. Full backup files are .bak files or snapshots. A full backup once a day is the general recommendation. | Read more
Log Backup | The date of the last log backup performed on the database. Log backup files are .trn files. A log backup once an hour is the general recommendation, but if the recovery model is “simple” this value should be null. | Read more
Diff Backup | The date of the last differential backup performed on the database. In general, once a day is recommended, but it depends on the full backup frequency. | Read more
MemoryThe amount of memory the database is taking up in the physical memory in MB.
SizeA pie chart of the distribution of data and log file sizes on disk storage, measured in MB. It is not recommended that the log take up more than 60% of the database size. If it does, investigate the process integrity of this database, such as transactions (containing recursion) and backups. Bandwidth relates to the speed at which data can be transmitted between devices. A higher bandwidth allows for faster and more extensive data transfer.
Disk IO/secThe number of disk reads and writes per second.
Usually, the main or biggest DBs will have a high value. Higher-than-usual values can indicate poorly performing queries that make other queries wait for free IO.
Data GrowthThe rate of information growth in the database on the disk storage in MB, which includes all the filegroups that contain the primary data file (.mdf).
A lack of space in the disk storage may indicate substantial data growth in the database.
Read more
Log GrowthThe rate of log growth in the database on the disk storage in MB.
A lack of space in the disk storage may indicate substantial log growth in the database.
Read more
In-MemoryKnown as In-Memory OLTP, a feature in SQL Server that uses memory-resident tables and natively compiled stored procedures for better performance in specific transactions.
Unused data spaceFree space in the DB: allocated data space that is not in use.
A value higher than 50% indicates that shrinking the data files should be considered.
Keeping at least 10% of the DB space unused, for indexes and other needs, is recommended.
CollationThe language and the manner of string comparison defined for the database.
Page VerifyPage Verify is a database option that defines the SQL Server mechanism of verifying page consistency when the page is written to disk and when it is read again from disk.
The recommendation is CHECKSUM.
DBCC last successThe last successful database check. The check verifies the integrity of the database, including its tables, indexes, and schema.
Running this check daily is very important for keeping the organization’s databases functioning properly. The check runs on both a physical level and a logical level.
CompatibilityThe database compatibility level, which determines the SQL Server version whose behavior the database follows.
TransactionsThe number of transaction operations UPDATE, INSERT, DELETE, BEGIN TRAN executed per second.
A high value (above average) may be the reason for slowness or log growth issues.
Log FlushThe time it takes to write the log held in physical memory to disk storage.
High values slow down transaction operations, updates, and writes to SQL Server.
File stream GrowthThe storage volume used by file streams.
File streams enable the storage of large amounts of data (more than 2 GB), such as large documents, images, or files.
High values may cause storage problems on the data drive.
File stream DriveThe file stream’s drive.
IOThe number of read and write operations on disk storage at the sampled time.
A high value can cause slowness as a result of load on the disk storage.
Log sizeThe size assigned to the log files of the database in MB.
Log UseThe size of the log in use, in MB.
Log FlushThe process of writing the contents of the transaction log buffer to the physical transaction log file on the disk, measured in milliseconds.
Longer flush times may increase the chance of data loss.
Log Reuse WaitThe condition that prevents the transaction log of a database from truncating or reusing log space. NOTHING is the desired value. REPLICATION appears for a database that takes part in replication.
Creation DateThe Database creation date.
Data FilesThe number of .mdf files the database contains (filegroups).
Log FilesThe number of .ldf files the database contains.
Disk Log IO/secThe number of log read and write operations on disk storage per second.
Data Read IO/secThe number of reads from disk storage per second.
Data Write IO/secThe number of writes to disk storage per second.
Open transactionsThe number of open transactions per second. A high number of open transactions can cause excessive log growth.Read more
Log transactionsThe amount of log, in MB, generated while there is an open transaction. Log growth can increase when running transactions are not completed and cleaned up.
Transaction DurationThe duration of the transaction.
AlwaysOn StateThe DB’s AlwaysOn state: synchronized or not synchronized. Not synchronized means that there is a problem with AlwaysOn.Read more
AlwaysOn StatusThe AlwaysOn status: healthy or not healthy. Not healthy means that there is a problem with AlwaysOn.Read more
AlwaysOn graphThe value is 1 if AlwaysOn is active and 0 if it is not.
AlwaysOn Log records not committed at SecondaryThe number of AlwaysOn log records that have not yet been committed from the primary to the secondary server.
When these graphs are active, this is the secondary group; if there is no data, it is the primary group.
AlwaysOn Log records waiting to send to SecondaryThe number of AlwaysOn log records waiting to be sent from the primary server to the secondary server.
When these graphs are active, this is the secondary group; if there is no data, it is the primary group.
AlwaysOn is Primary0 for secondary server databases, 1 for primary server databases.
AlwaysOn group nameThe group that a DB in AlwaysOn belongs to.
The group can hold several databases in the Enterprise edition.
Mirror StatusThe DB’s Mirror status: synchronized or not synchronized.
Not synchronized means that there is a problem with the Mirror.
Read more
Mirror Status GraphThe value is 1 if the Mirror is active and 0 if it is not.
Mirror ModeThe DB’s mirror mode: “principal” on the primary server and “mirror” on the secondary server.Read more
Mirror Log records not committed at SecondaryThe number of Mirror log records that have not yet been committed from the primary to the secondary server.Read more
Mirror Log records waiting to send to SecondaryThe number of Mirror log records waiting to be sent from the primary server to the secondary server.
Data DriveThe drive where the data files of the DB are located. It is recommended to keep the data, log, and tempdb files on separate drives.
Log DriveThe drive where the log files of the DB are located. It is recommended to keep the data, log, and tempdb files on separate drives.
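
The Size and Unused data space recommendations in this table can be expressed as a simple check. The following is a minimal Python sketch and not part of the AimBetter platform: the function names are hypothetical, and it assumes that "database size" means data size plus log size.

```python
def log_share_percent(data_mb: float, log_mb: float) -> float:
    """Return the log files' share of the total database size, in percent."""
    total_mb = data_mb + log_mb  # assumption: database size = data + log
    if total_mb <= 0:
        raise ValueError("database size must be positive")
    return 100.0 * log_mb / total_mb


def size_findings(data_mb: float, log_mb: float, unused_data_percent: float) -> list[str]:
    """Apply the 60% log-share and 10%/50% unused-space guidance from the table."""
    findings = []
    if log_share_percent(data_mb, log_mb) > 60.0:
        findings.append("log takes up more than 60% of the database size")
    if unused_data_percent > 50.0:
        findings.append("unused data space above 50%: consider a shrink")
    elif unused_data_percent < 10.0:
        findings.append("unused data space below the recommended 10%")
    return findings
```

For example, `size_findings(400, 700, 5)` reports both an oversized log and too little unused data space, while `size_findings(900, 100, 20)` returns an empty list.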

— Wait Stats on Windows MSSQL

MetricDescriptionInvestigate this alert
Wait TypeThe SQL Server wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there is a wait for the specific resource, which can be caused by one or more delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
A value higher than average indicates a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
A value higher than average indicates a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.
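
To illustrate how Wait (%) relates to Wait (ms), here is a minimal Python sketch that computes each wait type's share of the total wait time. It assumes the percentage is taken over all sampled wait types; the function name and the sample values are hypothetical, not AimBetter's implementation.

```python
def wait_percentages(wait_ms_by_type: dict[str, float]) -> dict[str, float]:
    """Return each wait type's share of the total wait time, in percent."""
    total_ms = sum(wait_ms_by_type.values())
    if total_ms == 0:
        # No waits were recorded, so every share is zero.
        return {name: 0.0 for name in wait_ms_by_type}
    return {name: 100.0 * ms / total_ms for name, ms in wait_ms_by_type.items()}


# Illustrative sample values only.
sample = {"PAGEIOLATCH_SH": 750.0, "CXPACKET": 250.0}
shares = wait_percentages(sample)
```

A wait type whose share is well above its usual level is the first candidate to investigate.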

— Wait Stats on Windows Oracle

MetricDescriptionInvestigate this alert
Wait TypeThe Oracle wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there is a wait for the specific resource, which can be caused by one or more delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
A value higher than average indicates a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
A value higher than average indicates a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.

— App Pools

MetricDescriptionInvestigate this alert
App PoolThe application pool name.
StateThe application pool status: running or stopped.
Total App RecyclesThe number of application recycles.
Any recycle that does not correspond to a scheduled task indicates a problem.
UptimeThe application uptime in days. This value should be as high as possible.
ProcessThe App Pool’s process name according to a specific ID. Shown only if the process consumes CPU, memory, or page file.
CPUThe processor usage (%) by the app pool. Shown only above a defined minimum value that is considered influential and significant.
MemoryThe RAM consumption of the app pool in MB. Shown only above a defined minimum value that is considered influential and significant.
Page filesThe paging consumption of the app pool in MB. Shown only above a defined minimum value that is considered influential and significant.

— Web Sites

MetricDescriptionInvestigate this alert
Web SiteThe website name.
App PoolThe application pool name of the website.
Current ConnectionsThe number of connections to the website at the moment.
A high value indicates that there are probably unnecessary open sites.
Get Request/SecThe number of GET requests per second to the website at the moment.
A high value indicates increased activity that may cause slowness.
Post Requests/secThe number of POST requests per second to the website at the moment.
A high value indicates increased activity that may cause slowness.
Bytes Received/SecThe number of bytes received per second.
A high value indicates increased activity that may cause slowness.
Bytes Sent/SecThe number of bytes sent per second.
A high value indicates increased activity that may cause slowness.
Deleted Requests/SecThe number of deleted requests per second. If not 0, it indicates an error status.
Files received/secThe number of files received per second.
Files sent/secThe number of files sent per second.
Not found Errors/secThe number of “page not found” (HTTP 404) errors per second. Indicates a problem reaching the requested URL.
Put Requests/SecThe number of PUT requests per second.
A high value indicates increased activity that may cause slowness.

-Linux

— Hosts

MetricDescriptionInvestigate this alert
CPU UsageThe overall percentage of time the CPU spends executing non-idle tasks. High CPU utilization may indicate that the CPU is under heavy load and could be a bottleneck in system performance.Read more
Total memoryThe total amount of RAM (Random Access Memory) in GB. It is a volatile memory that provides temporary storage for data and instructions that the CPU needs to access quickly.Read more
Memory freeThe amount of system memory (RAM) in GB that is currently not used by any active processes or applications. It represents the memory portion readily available for immediate allocation and usage by the operating system or any newly launched programs.Read more
Memory free % (percentage)The percentage level of the system’s free memory in comparison to the total memory.Read more
Ping Lost Packets (0-12)The amount of unsuccessful communication integrity checks out of 12 attempts.Read more
Buffer CacheA portion of the system memory used to cache data from disk storage devices. The buffer cache operates as a dynamic pool of memory that grows or shrinks based on the system’s demand for caching data.
Swap CacheA cache for swapped-out pages that allows the kernel to quickly access and retrieve frequently accessed pages from memory without needing to perform disk I/O.
Total SwapThe total amount of swap space available on the system that functions as a designated area on the disk used by the Operating System as virtual memory when the physical memory is fully utilized.
Used SwapThe amount of used swap space. Low values mean the machine has free physical memory.
Free SwapThe amount of available swap space out of the total swap. High values mean free physical memory.
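
Two of the metrics above are simple ratios: Memory free % compares free memory to total memory, and Ping Lost Packets counts failures out of 12 attempts. The following Python sketch shows the arithmetic; the function names are hypothetical and it is an illustration rather than the platform's own code.

```python
def memory_free_percent(free_gb: float, total_gb: float) -> float:
    """Return free memory as a percentage of total memory."""
    if total_gb <= 0:
        raise ValueError("total memory must be positive")
    return 100.0 * free_gb / total_gb


def ping_loss_percent(lost_packets: int, attempts: int = 12) -> float:
    """Return the share of failed ping attempts, in percent (12 attempts by default)."""
    if not 0 <= lost_packets <= attempts:
        raise ValueError("lost packets must be between 0 and the number of attempts")
    return 100.0 * lost_packets / attempts
```

For example, 4 GB free out of 16 GB gives 25% free memory, and 3 lost packets out of 12 gives a 25% loss rate.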

— Network

MetricDescriptionInvestigate this alert
Card NameThe sampled network card name
ModelThe name and model number of a specific network interface card (NIC).
Receive Kbyte (sec)The amount of data received through the network card in KB per second. High values indicate that the server is receiving large amounts of data, which can cause system slowness.
Send Kbyte (sec)The amount of data sent through the network card in KB per second. High values indicate that the server is sending large amounts of data, which can cause system slowness.

— Disk

MetricDescriptionInvestigate this alert
DriveThe drive name.
File SystemA structured collection of files on a disk drive or a partition, which is a segment of the disk containing some specific data.
MountedThe directory (e.g., /usr/local) in the file system hierarchy through which the disk content is accessed.
TotalThe total disk storage capacity (in GB).
Disk Usage (GB)The disk storage usage in GB. Usage higher than 95% of the storage space can lead to loss of information and compromise the integrity of programs and processes in the system.Read more
Disk FreeThe free disk storage space in GB. Low free storage space can lead to loss of information and compromise the integrity of the programs and processes in the system.Read more
IO Write(sec)The number of disk writes per second. If the write rate is high, the system may respond slowly.
IO Read(sec)The number of disk reads per second. If the read rate is high, the system may respond slowly.
Utilization % (%)The percentage of disk storage space in use. A high percentage indicates that disk storage capacity has almost reached its limit. It’s recommended to look for the processes or files that consume most of the storage.Read more
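
The relationship between Total, Disk Usage (GB), and Utilization % can be sketched as follows, using the 95% usage level mentioned in the table as the warning threshold. This is a minimal Python illustration with hypothetical function names, not the platform's alerting logic.

```python
def disk_utilization_percent(used_gb: float, total_gb: float) -> float:
    """Return used disk space as a percentage of total capacity."""
    if total_gb <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used_gb / total_gb


def is_disk_critical(used_gb: float, total_gb: float, threshold_percent: float = 95.0) -> bool:
    """Return True when utilization exceeds the threshold (95% by default)."""
    return disk_utilization_percent(used_gb, total_gb) > threshold_percent
```

For example, a 500 GB drive with 480 GB in use is at 96% utilization and would be flagged, while the same drive with 400 GB in use (80%) would not.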

— Swaps

MetricDescriptionInvestigate this alert
FileNameThe swap file name.
TypeOne of the two main types of swap storage: swap partitions and swap files. A swap file is a regular file on the file system that is used as virtual memory when the physical memory is full. A swap partition is a dedicated section of a hard disk or solid-state drive (SSD) that is reserved separately for use as swap space.
Total SizeThe total amount of available storage for the swap file.
UsedThe storage used for the swap file.

— Services

MetricDescriptionInvestigate this alert
HostThe host name as defined in the AimBetter Configuration
ServiceThe name of the service in the registry (service’s key)
LoadedIn Linux, a service can be “loaded” or “not-found”. “Loaded” refers to a service that is currently configured to start automatically during system boot-up and found in the system’s management of services. “Not-Found” refers to a configuration file or service unit that does not exist, therefore it can’t be managed in the system’s management of services.
IsActiveIndicates whether the service is running and available for use or not.
StatusIndicates the current state or condition of the service.

— Process

MetricDescriptionInvestigate this alert
HostThe host name as defined in the AimBetter Configuration
CPUThe processor usage (percentage) by the process. High values can lead to system slowness.Read more
Process IDA number identifying the process in the system.
Command LineThe running command of the executable file which the process is running, including parameters.
Physical Mem.The actual hardware memory used by the process.
Virtual Mem.The process usage of memory beyond the physical memory such as disk space or swap space as an extension of the RAM memory. The general recommendation is for this value to be as low as possible.
User NameThe name of the user running the process.
In cases where a process needs to be shut down due to high system resource consumption, it is important to know who is running the process.
StartThe time when the process was started.
StatusThe process status. In Linux it can be: Ss, Sl or both.
TerminalThe process terminal.

— MSSQL on Linux

MetricDescriptionInvestigate this alert
VersionThe SQL Server version installed on the server.
InstanceThe SQL server instance name given in the installation.
Test connectionA check of the time to establish a connection to the SQL server in milliseconds. A high value indicates that there are network communication problems or a load on the SQL server.Read more
Last RestartThe last restart of the SQL Server.
CollationIn SQL Server, a collation is a set of rules that determine how data is sorted and compared, for string based operations. SQL collations allow database administrators to define the appropriate rules for sorting and comparing strings based on the specific language and cultural context of the data being stored.
EditionThe installed SQL Server edition, for example Express, Developer, or Enterprise. There are numerous editions, and each edition has two runtimes: 32-bit or 64-bit.
SPThe Service Pack, which includes cumulative updates of all the fixes and improvements from previous service packs and cumulative updates for a specific version of SQL Server.
Page life expectancyThe time, in seconds, that SQL Server keeps retrieved information in the server’s physical memory. Low values indicate that SQL Server is replacing the information held in physical memory at a high frequency and needs more physical memory in order to perform faster.Read more
User ConnectionsA connection established between a client application and a database server using SQL credentials is considered a single user connection. A large number may indicate a load on the system, a fault, or a security error.Read more
Connection reuse/secThe total number of logins started from the connection pool per second. Apps tend to open and close connections repeatedly – this value indicates the amount of the connections’ reuse.
Batch requests/secThe number of updates, retrievals, deletions, or saving operations in the SQL per second.
This metric enables the user to detect abnormalities in the operations amount on the SQL server.
Read more
Buffer cache hit ratioThe percentage of memory requests that are satisfied from the cache (physical memory of the SQL server).
Values below 90% indicate multiple reads/writes from/to the main memory or disk storage.
You should investigate whether there is a high physical memory consumption by different programs or processes and consider the need to add physical memory to the SQL server.
Read more
Page reads/secThe number of page reads (each page is 8 KB) from the disk per second.
Many reads indicate a need to examine the SQL Server’s integrity, indexing, and system query logic.
Page writes/secThe number of page writes (each page is 8 KB) to the disk per second.
Many writes indicate a need to examine the SQL Server’s integrity, indexing, and system query logic.
SP CompilationThe number of times per second that SQL Server compiles the execution plans of queries.
A large number of compilations together with a small number of Batch requests indicates heavy use of direct queries or sp_executesql rather than procedures with defined parameters.
SP Re CompilationThe number of times per second that SQL Server recompiles the execution plans of queries.
A large number of recompilations, combined with a small number of Batch requests, indicates that the amount of retrieved data has grown, statistics have been updated, or indexes have been rebuilt.
Investigate the amount of data and whether these other operations have been performed.
Page LookupsThe number of times SQL Server looks up pages (each page is 8 KB) in physical memory.
A ratio of (Page lookups/sec) / (Batch requests/sec) greater than 100 indicates that some queries are not running optimally.
Latches TimesThe duration in seconds for which a thread holds exclusive access (a “latch”) to a shared resource (for example, a “latched table”).
A high number of latches slows data retrieval from the latched tables. Investigate changing the Update or Deletion method.
Page Splits/secThe number of page splits per second. A page is split to allocate space when the index has no room for new data.
A rate higher than 20 per second requires a check of the index specifications.
Checkpoint Pages/secThe number of data pages written to disk per second during a checkpoint operation. A checkpoint is a process in which the SQL Server ensures that all modified data pages in memory are flushed and written to disk to maintain data consistency and durability.
DB IO/secThe amount of reads and writes of the entire database per second
Target MemoryThe target RAM memory limit that the SQL Server is allowed to consume and utilize for its internal operations.
MemoryThe amount of memory SQL Server is utilizing in MB. If SQL is not using the maximum memory amount specified, we should consider lowering this amount.
Memory DetailsA pie chart of how the SQL Server’s physical memory usage is divided among the database, internal needs, and free memory, in MB.
DB MemoryThe memory used by the SQL Server instance to cache data and other objects related to specific databases.
Free MemoryThe amount of physical memory not utilized by SQL Server in MB.
A high value may indicate that the assigned memory to the SQL Server can be reduced.
Internal MemoryThe amount of physical memory which the SQL Server is utilizing for internal operations, not including operations for the database, in MB. For example: buffer pool, execution plans, system tables, procedures cache, and management.Read more
Memory (min)The minimum amount of assigned physical memory which the SQL can use in MB.
Memory(max)The maximum amount of assigned physical memory which the SQL can use in MB.
Temp table creation/secThe number of temp table creations per second.
UptimeHow long the SQL Server instance has been up and running.
It’s recommended to have as high an uptime value as possible on high-traffic instances.
Cluster active nameThe name of the active cluster in clustered instances.
Cluster nodes downThe amount of cluster nodes that are down. The server may be one of the cluster nodes.
Transactions/secNumber of open transactions per second.
Values higher than usual can cause system slowness.
Read more
Lazy writes/secMeasures the process of flushing modified data pages from memory (buffer cache) to disk (data files) per second. By deferring the immediate disk writes and batching them together, lazy writes help improve system performance. It reduces the frequency of disk I/O operations, minimizes disk access latency, and allows the system to perform multiple writes in a more efficient manner.
A high value indicates that the SQL server needs more memory and can affect other OS resources, such as disk IO and CPU usage.
Index Full scan/secThe number of indexes scanned per second. An index full scan is an alternative to a full table scan when the index contains all the columns needed for the query and at least one column in the index key has a “NOT NULL” constraint.
Index page splits/secThe number of index page splits per second, affected by fragmentation. A page split occurs when there is no free space for updating or inserting a value in the table; the page is split to free space so that the command can complete.
Logins/secThe number of logins to the SQL server per second. A high value may refer to security or application problems.
Logouts/secThe number of logouts from the SQL server per second. A high value may refer to connection problems.
Core availableThe total number of cores in the server.
Core in useThe number of cores that are assigned for SQL Server use. The recommendation is that the SQL Server uses all cores.
Session Memory waitThe number of sessions that are waiting for free memory. Those queries don’t have enough RAM memory to start running, so they are “delayed.”
Create temp table/ variablesThe number of created temporary tables/ variables available. A high value can indicate unnecessary open connections.
TempDB free spaceTempdb database unused data space in KB. A high value may indicate unusual data growth.
Session avg. wait for signalThe average time in milliseconds that SQL Server reports sessions spending in a wait.
The threshold is based on past activity and behavior. A value higher than average can slow SQL Server performance.
Read more
Session CPU waitThe number of queries that SQL Server reports as waiting for CPU availability.
The threshold is based on past activity and behavior. When the value is higher than average, it’s recommended to investigate and look for those queries.
Read more
Currently ActiveThe number of queries that are currently running (status is running; for Oracle, status is active).
Currently BlockedThe number of queries that are currently blocked (status is suspended).Read more
Currently SleepingThe number of queries that are currently sleeping. A sleeping query has been executed and its results have been returned to the client application, but the connection to the database is still open and waiting for further instructions or actions from the client.
Currently BackgroundThe number of queries that are running in the background. Background queries run separately from the main execution flow of a program or application in order to prevent blocking.
Currently Open TransactionsThe number of queries that have open transactions at the moment.Read more
Currently KilledThe number of queries that were killed in the last minute.Read more
Currently Avg Duration/secMeasures the average time taken to execute a single query or command in seconds.
Number of Queries 0-9.99A count of queries running up to 10 seconds.Read more
Number of Queries 10-19.99A count of queries running between 10 and 20 seconds.Read more
Number of Queries 20-29.99A count of queries running between 20 and 30 seconds.Read more
Number of Queries 30-59.99A count of queries running between 30 and 60 seconds.Read more
Number of Queries over 60A count of queries running over 60 seconds.Read more
Subscriber High latencyThe time it takes for changes made at the publisher to be replicated to the subscriber.Read more
Distributor High latencyThe time it takes for transactional changes generated at the publisher to be delivered to the distributor for further replication processing.Read more
LogReader High latencyThe time it takes for the LogReader agent to read the transaction log from the publisher and deliver the changes to the distributor for replication.Read more
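
The Page Lookups guidance in this table compares two rates: a ratio of Page lookups/sec to Batch requests/sec above 100 suggests that some queries are not running optimally. Below is a minimal Python sketch of that check; the function names are hypothetical, and returning None for an idle server is an assumption made for this illustration.

```python
from typing import Optional


def lookups_per_batch(page_lookups_per_sec: float, batch_requests_per_sec: float) -> Optional[float]:
    """Return page lookups per batch request, or None when there are no batch requests."""
    if batch_requests_per_sec <= 0:
        return None  # assumption: the ratio is undefined on an idle server
    return page_lookups_per_sec / batch_requests_per_sec


def has_inefficient_queries(page_lookups_per_sec: float, batch_requests_per_sec: float) -> bool:
    """Return True when the ratio exceeds the threshold of 100 given in the table."""
    ratio = lookups_per_batch(page_lookups_per_sec, batch_requests_per_sec)
    return ratio is not None and ratio > 100
```

For example, 50,000 page lookups/sec against 400 batch requests/sec gives a ratio of 125, which exceeds the threshold.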

— Oracle on Linux

MetricDescriptionInvestigate this alert
DatabaseThe Database name.
EditionThe Database edition.
32/64The Database runtime – 32bit or 64bit.
VersionThe Database version.
Log ModeRefers to the management of the Database redo logs, used for data integrity or for recovery in case of a disaster.
There are several types:
ARCHIVELOG Mode: allows creating backups that capture changes made to the database since the last backup.
NONARCHIVELOG Mode: limits the ability to perform point-in-time recovery, since archived redo logs are not available.
FORCE LOGGING Mode: all data changes made to the database are logged to the redo log files, even for operations that would not typically generate redo logs.
National Language (NLS)Refers to a set of features and settings that allow Oracle Database to handle multiple languages.
Patch LevelThe version number of the Oracle software and the cumulative updates and releases.
Last RestartLast restart in the format date:hour:minute
Test ConnectionA check of the time to establish a connection to the Database in milliseconds. A high value indicates that there are network communication problems or a load on the Oracle Database.Read more
Session LimitThe number of user sessions currently connected to the database, as a percentage of the maximum sessions allowed.
Session (Max)The maximum number of concurrent user sessions allowed to connect to the database. Each user session represents a connection with the database.
Processes (Max)The maximum number of concurrent user processes allowed to connect to the database.
Default block sizeThe standard size of a data block used for storing data and managing database objects within the database’s data files. As of Oracle Database 12c, the default block size is typically 8192 bytes (8 KB) for a general-purpose database.
Open Cursors (max)The maximum possible number of open cursors in the database. A cursor is a programmatic handle or pointer used by the database to access or process the results of queries or DML statements. It is essential for developers to explicitly close cursors when they are no longer needed.
DR Last Sync DateThe last synchronization date and time for a Data Guard configuration (a high-availability and disaster recovery solution that keeps standby databases synchronized with the primary database).Read more
Physical ReadsThe number of reads of the entire database, measured in blocks as defined in “Default Block Size” (default is 8 KB).
Physical WritesThe number of writes of the entire database, measured in blocks as defined in “Default Block Size” (default is 8 KB).
DR Full BackupThe last date and time of a full backup taken from the primary database and used to initialize or restore a standby database in a Data Guard configuration.Read more
Archive Log BackupThe last date and time of a backup operation that specifically targets the archived redo logs.Read more
CTL SP File BackupThe last date and time of a backup of the “control file and server parameter (SP) file.” It helps to ensure the recoverability of the database in case of disasters, media failures, or user errors.Read more
Avg. ThreadsThe average amount of concurrent executions of multiple tasks or processes, such as: Operating System threads, Java threads, Database sessions, or parallel query executions. Calculated as the amount of active sessions / CPU cores (in percentage).
Database CPU TimeThe percentage of CPU time remaining for the execution of SQL statements by the Oracle database processes and for other database-related operations. Higher values mean fewer waits for CPU, which improves performance. Best practices include minimizing scans and waits in query execution.
Buffer Cache HitThe percentage of requested data blocks found in the database buffer cache, thereby avoiding the need to read the block from disk.Read more
PGA Cache HitThe percentage of times process data requests are found in the Program Global Area (PGA) cache allocation, without a need for additional memory or a read from disk. The higher this value, the more efficient the database is.Read more
DeadlocksThe amount of deadlocks in the database server.Read more
Invalid ObjectThe amount of database objects that are currently in an invalid state.
Redo Entries (rows update)The amount of records that capture changes made to the database, related to redo log. When this value is higher than usual, it may indicate a possible cause for slowness in performance.
Query ExecuteThe amount of queries executed at the moment.
Avg. SessionsThe average amount of sessions at the moment.
Avg. ActiveThe average amount of active sessions at the moment.
Avg. BlockingThe average amount of blocking sessions at the moment.Read more
Avg. Sleeping BlockingThe average amount of both sleeping and blocking sessions at the moment.
Avg. BlockedThe average amount of blocked sessions at the moment.Read more
Avg. Open TransactionThe average amount of open transactions at the moment.Read more
Avg. SleepingThe average number of sleeping sessions at the moment. A sleeping session has executed a query and returned its result to the client application, but the connection to the database is still open and waiting for further instructions or actions from the client.
Avg. BackgroundThe average number of background sessions at the moment. Background sessions run separately from the main execution flow of a program or application in order to prevent blocking.
Avg. DurationThe average duration in seconds of all queries running at the moment in different sessions.
Number of Queries 0-9.99A count of queries running up to 10 seconds.Read more
Number of Queries 10-19.99A count of queries running between 10 and 20 seconds.Read more
Number of Queries 20-29.99A count of queries running between 20 and 30 seconds.Read more
Number of Queries 30-59.99A count of queries running between 30 and 60 seconds.Read more
Number of Queries over 60A count of queries running over 60 seconds.Read more
Archive Logs RetentionThe retention period for archived redo log files.
Log SwitchThe number of completed log file switches. A log switch occurs when the database stops writing redo log entries to one redo log group (also known as a redo log file) and starts writing to another.
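
The "Number of Queries" rows above group running queries by duration. A minimal Python sketch of that grouping follows. The bucket labels are taken from the table; the function names are hypothetical, and placing a query of exactly 60 seconds in the "over 60" bucket is an assumption, since the table does not state where that boundary falls.

```python
from bisect import bisect_right

# Upper edges (in seconds) separating the duration buckets listed in the table.
BUCKET_EDGES = [10, 20, 30, 60]
BUCKET_LABELS = ["0-9.99", "10-19.99", "20-29.99", "30-59.99", "over 60"]


def bucket_label(duration_sec: float) -> str:
    """Return the label of the bucket that a query duration falls into."""
    if duration_sec < 0:
        raise ValueError("duration must not be negative")
    return BUCKET_LABELS[bisect_right(BUCKET_EDGES, duration_sec)]


def count_by_bucket(durations_sec: list[float]) -> dict[str, int]:
    """Count how many query durations fall into each bucket."""
    counts = {label: 0 for label in BUCKET_LABELS}
    for duration in durations_sec:
        counts[bucket_label(duration)] += 1
    return counts
```

A growing count in the longer buckets is the signal to look for the specific long-running queries.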

— DB on Linux MSSQL

MetricDescriptionInvestigate this alert
StatusDatabase Status:
● Online – the database is available
● Offline – the database is not in use
● Mirror Disconnect – the sync is disconnected.
● Mirror Principal – the principal sync of all updating of the database.
● Mirror – the database is synchronized.
● Restoring – the database is currently being restored
● Suspect – the database is defective
Read more
InstanceThe SQL server instance name given in the installation.
DatabaseThe Database name
RecoveryThe recovery model determines the possible restore options specified for the database. It defines how the database transaction logs are managed and which data type can be recovered in case of a failure.
Full BackupThe date of the last Full Backup performed on the database.
The Full Backup documents are .bak files or snapshots.
A Full Backup once a day is the general recommendation.
Read more
Log BackupThe date of the last Log changes backup performed on the database.
The Log Backup documents are .trn files
A log backup once an hour is the general recommendation but if the recovery model is “simple” this value should be null.
Read more
MemoryThe amount of memory the database is taking up in the physical memory in MB.
SizeA pie chart of the distribution of data and log file sizes occupying the disk storage, measured in MB. It is not recommended that the log take up more than 60% of the database size. If it does, we should investigate the processes of this database, such as transactions (including recursive ones) and backups. Bandwidth relates to the speed at which data can be transmitted between devices. A higher bandwidth allows for faster and more extensive data transfer.
Disk IO/secAmount of disk reads and writes per second.
Usually, the main or biggest DBs will have a high value. Higher values than usual can indicate a performance problem of queries causing other queries to wait for free IO.
Data GrowthThe rate of information growth in the database on the disk storage in MB, which includes all the filegroups that contain the primary data file (.mdf).
A lack of space in the disk storage may indicate substantial data growth in the database.
Read more
Log GrowthThe rate of log growth in the database on the disk storage in MB.
A lack of space in the disk storage may indicate substantial log growth in the database.
Read more
In-MemoryKnown as In-Memory OLTP, a feature in SQL Server that leverages memory-resident tables and natively compiled stored procedures for a better performance in specific transactions.
Unused data spaceFree space of the DB – data that is not in use.
A value higher than 50% indicates that shrinking the data files should be considered.
Keeping at least 10% of the DB as unused space, for indexes and other needs, is recommended.
CollationThe language and the manner of string comparison defined for the database.
Page VerifyPage Verify is a database option that defines the SQL Server mechanism of verifying page consistency when the page is written to disk and when it is read again from disk.
The recommendation is CHECKSUM.
DBCC last successLast successful Database check. Checks the database’s integrity, tables, indexes, schema, etc.
Running this test on a daily basis is very important for the proper functioning of the organization with the databases. The test runs on both a physical level and a logical level.
CompatibilityThe Compiler version at the Database level.
Diff BackupDate of the last differential backup performed for the Database.
In general, it’s recommended once a day but it depends on the full backup frequency.
Read more
TransactionsThe number of transaction operations UPDATE, INSERT, DELETE, BEGIN TRAN executed per second.
A high value (above average) may be the reason for slowness or log growth issues.
Log FlushThe time it takes to save the log found in the physical memory to the disk storage.
High values slow down transaction, update, and save operations in SQL.
File stream GrowthThe storage volume used by file streams.
File streams enable the storage of large amounts of data (more than 2 GB), such as large documents, images, or files.
High values may cause storage problems on the data drive.
File stream DriveThe file stream’s drive.
IOThe amount of read and write operations from the disk storage at the sampled time.
A high value can cause slowness as a result of a load on the disk storage
Log sizeThe size assigned to the log files of the database in MB.
Log UseThe size of the log used in MB
Log FlushThe process of writing the contents of the transaction log buffer to the physical transaction log file on the disk, measured in milliseconds.
A longer flush time may increase the risk of data loss.
Log Reuse WaitThe reason the transaction log of a database is unable to truncate or reuse log space. NOTHING is a good value. REPLICATION appears for a database that takes part in replication.
Creation DateThe Database creation date.
Data FilesThe number of .mdf files the database contains (filegroups).
Log FilesThe number of .ldf files the database contains.
Disk Log IO/secThe number of log input and output operations on the disk storage per second.
Data Read IO/secThe number of reads from the disk storage per second.
Data Write IO/secThe number of writes from the disk storage per second.
Open transactionsThe number of open transactions per second. A high number of open transactions can cause log oversize.Read more
Log transactionsThe log amount in MB while there’s an open transaction. An increased log growth can be caused when transactions don’t clean themselves while running.
Transaction DurationThe duration of the transaction.
Alwayson StateThe DB’s AlwaysOn state: synchronized or not synchronized. Not synchronized means that there’s a problem with AlwaysOn.Read more
AlwaysOn StatusThe AlwaysOn status: healthy or not healthy. Not healthy means that there’s a problem with AlwaysOn.Read more
AlwaysOn graphThe value is 1 if AlwaysOn is active and 0 if it is not.
AlwaysOn Log records not committed at SecondaryThe number of AlwaysOn log records sent from the primary server that have not yet been committed at the secondary server.
When these graphs are active, this is the secondary group; if there’s no data, it’s the primary group.
AlwaysOn Log records waiting to send to SecondaryThe number of AlwaysOn log records waiting to be sent from the primary server to the secondary server.
When these graphs are active, this is the secondary group; if there’s no data, it’s the primary group.
AlwaysOn is Primary0 for secondary server databases, 1 for primary server databases
AlwaysOn group nameThe AlwaysOn group that the DB belongs to.
The group can hold several databases in the Enterprise edition.
Mirror StatusThe DB’s Mirror status: synchronized or not synchronized.
Not synchronized means that there’s a problem with the Mirror.
Read more
Mirror Status GraphThe value is 1 if the Mirror is active and 0 if it is not.
Mirror ModeThe DB’s mirror mode on the primary server is “principal,” and on the secondary server is “mirror.”Read more
Mirror Log records not committed at SecondaryAmount of the Mirror logs that couldn’t be committed yet from the primary to the secondary server.Read more
Mirror Log records waiting to send to SecondaryAmount of the Mirror logs waiting to be sent to the secondary server (from the primary server).
Data DriveThe drive where the data files of the DB are located. Keeping the data, log, and tempdb files on separate drives is recommended.
Log DriveThe drive where the log files of the DB are located. Keeping the data, log, and tempdb files on separate drives is recommended.
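
The Size and Unused data space rows above state three numeric guidelines: the log should not exceed 60% of the database size, unused data space above 50% suggests a shrink, and at least 10% unused space is recommended. The sketch below applies them to sampled values. It is an illustration only: the function name `size_findings` is hypothetical, and treating "database size" as data size plus log size is an assumption.

```python
def size_findings(data_mb, log_mb, unused_data_pct):
    """Apply the size guidelines from the table above (hypothetical helper)."""
    findings = []
    # Assumption: "database size" means data files plus log files.
    total_mb = data_mb + log_mb
    if total_mb > 0 and log_mb / total_mb > 0.60:
        findings.append("log exceeds 60% of database size")
    if unused_data_pct > 50:
        findings.append("unused data space above 50%: consider a shrink")
    elif unused_data_pct < 10:
        findings.append("unused data space below the recommended 10%")
    return findings
```

A database with 1,000 MB of data, 4,000 MB of log, and 5% unused data space would produce two findings, one for the log share and one for the low unused space.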

— Wait Stats on Linux MSSQL

MetricDescriptionInvestigate this alert
Wait TypeThe SQL Server wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there is a wait for the specific resource, which can be caused by specific delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
If the value is higher than average, there is a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
If the value is higher than average, there is a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.
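
The relationship between Wait (ms), Wait (%), and Avg Wait (ms) can be sketched as follows. This is a hedged illustration rather than the platform's actual calculation. It assumes that Wait (%) is each wait type's share of the total wait time and that Avg Wait (ms) is the wait time divided by the number of waiting tasks; the function name `wait_summary` and the input shape are hypothetical.

```python
def wait_summary(waits):
    """Derive Wait (%) and Avg Wait (ms) per wait type.

    waits maps a wait type name to a tuple of
    (total wait time in ms, number of waiting tasks).
    """
    total_ms = sum(ms for ms, _ in waits.values())
    summary = {}
    for wait_type, (ms, tasks) in waits.items():
        summary[wait_type] = {
            # Share of this wait type in the total wait time.
            "wait_pct": 100.0 * ms / total_ms if total_ms else 0.0,
            # Assumption: average wait = wait time / waiting tasks.
            "avg_wait_ms": ms / tasks if tasks else 0.0,
        }
    return summary
```

Comparing each value against its own historical average, as the table suggests, would then be a separate step on top of this summary.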

— Wait Stats on Linux Oracle

MetricDescriptionInvestigate this alert
Wait TypeThe Oracle wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there is a wait for the specific resource, which can be caused by specific delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
If the value is higher than average, there is a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
If the value is higher than average, there is a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.

-Azure

— Service

MetricDescriptionInvestigate this alert
CloudThe Database name given in the Aimbetter configuration.
SKUSKU (Stock Keeping Unit) is an identifier used to represent different service tiers and performance levels for Azure products.
CapacityThe capacity of the Azure SQL Database, defined by the selected service tier and the number of DTUs.
DTUDatabase Throughput Units – a performance metric that measures resource usage (Data IO, Log Write, CPU) as a percentage. A higher value may indicate an overload on the cloud database due to heavy activity.Read more
Free Storage SpaceThe free storage space in GB out of the allocated storage.
Data I/OThe amount of data read from or written to storage, as a percentage.
Log WriteThe rate of writing to the transaction log, as a percentage.
CPUThe overall percentage of time the CPU spends executing non-idle tasks.
Max WorkerThe percentage of worker threads that can be used (out of maximum possible) to process concurrent queries and tasks within the database engine.Read more
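
The DTU row above combines three resource percentages (Data I/O, Log Write, CPU) into one figure. One common way to combine them is to take the highest of the three, which also names the resource that limits the database. The sketch below does that; whether AimBetter computes the DTU value this way is an assumption, and the function name `dtu_percent` is hypothetical.

```python
def dtu_percent(cpu_pct, data_io_pct, log_write_pct):
    """Return (DTU percentage, limiting resource).

    Assumption: the DTU percentage is the highest of the three
    resource percentages.
    """
    usage = {"CPU": cpu_pct, "Data I/O": data_io_pct, "Log Write": log_write_pct}
    limiting = max(usage, key=usage.get)
    return usage[limiting], limiting
```

With CPU at 35%, Data I/O at 80%, and Log Write at 12.5%, this reports 80% with Data I/O as the limiting resource, which points the investigation at disk activity rather than CPU.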

— MSSQL on Azure

MetricDescriptionInvestigate this alert
VersionThe SQL version, installed on the server
InstanceThe SQL server instance name given in the installation.
Test connectionA check of the time to establish a connection to the SQL server in milliseconds. A high value indicates that there are network communication problems or a load on the SQL server.Read more
Last RestartSQL server last restart
CollationIn SQL Server, a collation is a set of rules that determine how data is sorted and compared, for string based operations. SQL collations allow database administrators to define the appropriate rules for sorting and comparing strings based on the specific language and cultural context of the data being stored.
EditionThe installed SQL Server edition. There are numerous editions (for example, Express, Developer, and Enterprise), and each edition has two runtimes: 32-bit or 64-bit.
SPThe Service Pack, which includes cumulative updates of all the fixes and improvements from previous service packs and cumulative updates for a specific version of SQL Server.
Page life expectancyThe time, in seconds, that SQL Server keeps retrieved information in the server’s physical memory. Low values indicate that SQL Server is replacing the information held in physical memory at a high frequency and needs more physical memory in order to perform faster.Read more
User ConnectionsA connection established between a client application and a database server using SQL credentials is considered a single user connection. A large number may indicate a load on the system, a fault, or a security error.Read more
Connection reuse/secThe total number of logins started from the connection pool per second. Apps tend to open and close connections repeatedly – this value indicates the amount of the connections’ reuse.
Batch requests/secThe number of updates, retrievals, deletions, or saving operations in the SQL per second.
This metric enables the user to detect abnormalities in the operations amount on the SQL server.
Read more
Buffer cache hit ratioThe percentage of memory requests that are satisfied from the cache (physical memory of the SQL server).
Values below 90% indicate multiple reads/writes from/to the main memory or disk storage.
You should investigate whether there is a high physical memory consumption by different programs or processes and consider the need to add physical memory to the SQL server.
Read more
Page reads/secThe number of page reads (each page is 8 KB) from the disk per second.
Many reads indicate that we should examine the SQL server’s integrity, indexing, and system query logic.
Page writes/secThe number of page writes (each page is 8 KB) to the disk per second.
Many writes indicate that we should examine the SQL server’s integrity, indexing, and system query logic.
SP CompilationThe number of times per second that SQL Server compiles the execution plans of queries.
A large number of compilations along with a small number of Batch requests indicates heavy use of direct queries and sp_executesql, rather than procedures with defined parameters.
SP Re CompilationThe number of times per second that SQL Server recompiles the execution plans of queries.
A large number of recompilations, combined with a small number of Batch requests, indicates that the amount of data retrieved by requests has grown, statistics have been updated, or indexes have been rebuilt.
We should investigate the amount of data and whether any of these operations have been performed.
Page LookupsThe number of times SQL Server looks up pages (each page is 8 KB) in physical memory.
A ratio of (Page lookups/sec) / (Batch requests/sec) greater than 100 indicates that some queries are not running optimally.
Latches TimesThe duration, in seconds, for which a thread holds exclusive access (a “latch”) to a shared resource (for example, a latched table).
A high number of latches slows down data retrieval from the latched tables. We should investigate changing the update or deletion method.
Page Splits/secThe number of page splits per second. A page is split for allocation purposes when the index does not have enough space.
A rate higher than 20 per second requires a check of the index specifications.
Checkpoint Pages/secThe number of data pages written to disk per second during a checkpoint operation. A checkpoint is a process in which SQL Server ensures that all modified data pages in memory are flushed and written to disk to maintain data consistency and durability.
DB IO/secThe amount of reads and writes of the entire database per second
Target MemoryThe target RAM memory limit that the SQL Server is allowed to consume and utilize for its internal operations.
MemoryThe amount of memory SQL Server is utilizing in MB. If SQL is not using the maximum memory amount specified, we should consider lowering this amount.
Memory DetailsA pie chart of how the physical memory used by SQL Server is divided among the database, internal needs, and free memory, in MB.
DB MemoryThe memory used by the SQL Server instance to cache data and other objects related to specific databases.
Free MemoryThe amount of physical memory not utilized by SQL Server in MB.
A high value may indicate that the memory assigned to SQL Server can be reduced.
Internal MemoryThe amount of physical memory that SQL Server is utilizing for internal operations, not including operations for the database, in MB. For example: buffer pool, execution plans, system tables, procedures cache, and management.Read more
Memory (min)The minimum amount of assigned physical memory that SQL Server can use, in MB.
Memory (max)The maximum amount of assigned physical memory that SQL Server can use, in MB.
Temp table creation/secThe number of temp table creations per second.
UptimeHow long the SQL Server instance has been up and running.
An uptime value as high as possible is recommended on high-traffic instances.
Cluster active nameThe name of the active cluster in clustered instances.
Cluster nodes downThe amount of cluster nodes that are down. The server may be one of the cluster nodes.
Transactions/secNumber of open transactions per second.
Values higher than usual can cause system slowness.
Read more
Lazy writes/secMeasures the process of flushing modified data pages from memory (buffer cache) to disk (data files) per second. By deferring the immediate disk writes and batching them together, lazy writes help improve system performance. It reduces the frequency of disk I/O operations, minimizes disk access latency, and allows the system to perform multiple writes in a more efficient manner.
A high value indicates that the SQL server needs more memory and can affect other OS resources, such as disk IO and CPU usage.
Index Full scan/secThe number of full index scans per second. A full index scan is an alternative to a full table scan when the index contains all the columns that are needed for the query, and at least one column in the index key has a “NOT NULL” constraint.
Index page splits/secThe number of index page splits per second, affected by fragmentation. A page split occurs when there is no available space on a page for updating or inserting a value in the table; the split frees space so that the command can complete.
Logins/secThe number of logins to the SQL server per second. A high value may point to security or application problems.
Logouts/secThe number of logouts from the SQL server per second. A high value may point to connection problems.
Core availableThe total number of cores in the server.
Core in useThe number of cores that are assigned for SQL Server use. The recommendation is that the SQL Server will use all cores.
Session Memory waitThe number of sessions that are waiting for free memory. Those queries don’t have enough RAM memory to start running, so they are “delayed.”
Create temp table/ variablesThe number of created temporary tables/ variables available. A high value can indicate unnecessary open connections.
TempDB free spaceTempdb database unused data space in KB. A high value may indicate unusual data growth.
Session avg. wait for signalThe average wait time, in milliseconds, that SQL Server reports spending in a wait.
The threshold is based on past activity and behavior. A value higher than average can slow SQL Server performance.
Read more
Session CPU waitThe number of queries that SQL Server reports as waiting for CPU availability.
The threshold is based on past activity and behavior. When the value is higher than average, it’s recommended to investigate and look for those queries.
Read more
Currently ActiveThe number of queries that are currently running (status is running; for Oracle, status is active).
Currently BlockedThe number of queries that are currently blocked (status is suspended).Read more
Currently SleepingThe number of queries that are currently sleeping. A sleeping query has been executed and its results have been returned to the client application, but the connection to the database is still open and waiting for further instructions or actions from the client.
Currently BackgroundThe number of queries that are running in the background. Background queries run separately from the main execution flow of a program or application in order to prevent blocking.
Currently Open TransactionsThe number of queries that have open transactions at the moment.Read more
Currently KilledThe number of queries that were killed in the last minute.Read more
Currently Avg Duration/secMeasures the average time taken to execute a single query or command in seconds.
Number of Queries 0-9.99A count of queries running for up to 10 seconds.Read more
Number of Queries 10-19.99A count of queries running for between 10 and 20 seconds.Read more
Number of Queries 20-29.99A count of queries running for between 20 and 30 seconds.Read more
Number of Queries 30-59.99A count of queries running for between 30 and 60 seconds.Read more
Number of Queries over 60A count of queries running for more than 60 seconds.Read more
Subscriber High latencyThe time it takes for changes made at the publisher to be replicated to the subscriber.Read more
Distributor High latencyThe time it takes for transactional changes generated at the publisher to be delivered to the distributor for further replication processing.Read more
LogReader High latencyThe time it takes for the LogReader agent to read the transaction log from the publisher and deliver the changes to the distributor for replication.Read more
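
Three rows in the table above give explicit numeric thresholds: a Buffer cache hit ratio below 90%, a ratio of Page lookups/sec to Batch requests/sec above 100, and more than 20 Page Splits/sec. The sketch below checks sampled values against them. It is an illustration only; the function name `instance_findings` and its parameter names are hypothetical, and how the platform itself raises alerts is not described in this guide.

```python
def instance_findings(buffer_cache_hit_pct, page_lookups_per_sec,
                      batch_requests_per_sec, page_splits_per_sec):
    """Check sampled counters against the thresholds in the table above."""
    findings = []
    if buffer_cache_hit_pct < 90:
        findings.append("buffer cache hit ratio below 90%")
    # Guard against division by zero on an idle instance.
    if (batch_requests_per_sec > 0
            and page_lookups_per_sec / batch_requests_per_sec > 100):
        findings.append("more than 100 page lookups per batch request")
    if page_splits_per_sec > 20:
        findings.append("more than 20 page splits per second")
    return findings
```

For instance, 90,000 page lookups/sec against 500 batch requests/sec gives a ratio of 180, which would be reported as queries not running optimally.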

— DB on Azure MSSQL

MetricDescriptionInvestigate this alert
StatusDatabase Status:
● Online – the database is available
● Offline – the database is not in use
● Mirror Disconnect – the sync is disconnected.
● Mirror Principal – the principal sync of all updating of the database.
● Mirror – the database is synchronized.
● Restoring – the database is currently being restored
● Suspect – the database is defective
Read more
InstanceThe SQL server instance name given in the installation.
DatabaseThe Database name
RecoveryThe recovery model determines the possible restore options specified for the database. It defines how the database transaction logs are managed and which data type can be recovered in case of a failure.
Full BackupThe date of the last Full Backup performed on the database.
The Full Backup documents are .bak files or snapshots.
A Full Backup once a day is the general recommendation.
Read more
Log BackupThe date of the last Log changes backup performed on the database.
The Log Backup documents are .trn files
A log backup once an hour is the general recommendation but if the recovery model is “simple” this value should be null.
Read more
MemoryThe amount of memory the database is taking up in the physical memory in MB.
SizeA pie chart of the distribution of data and log file sizes occupying the disk storage, measured in MB. It is not recommended that the log take up more than 60% of the database size. If it does, we should investigate the processes of this database, such as transactions (including recursive ones) and backups. Bandwidth relates to the speed at which data can be transmitted between devices. A higher bandwidth allows for faster and more extensive data transfer.
Disk IO/secAmount of disk reads and writes per second.
Usually, the main or biggest DBs will have a high value. Higher values than usual can indicate a performance problem of queries causing other queries to wait for free IO.
Data GrowthThe rate of information growth in the database on the disk storage in MB, which includes all the filegroups that contain the primary data file (.mdf).
A lack of space in the disk storage may indicate substantial data growth in the database.
Read more
Log GrowthThe rate of log growth in the database on the disk storage in MB.
A lack of space in the disk storage may indicate substantial log growth in the database.
Read more
In-MemoryKnown as In-Memory OLTP, a feature in SQL Server that leverages memory-resident tables and natively compiled stored procedures for a better performance in specific transactions.
Unused data spaceFree space of the DB – data that is not in use.
A value higher than 50% indicates that shrinking the data files should be considered.
Keeping at least 10% of the DB as unused space, for indexes and other needs, is recommended.
CollationThe language and the manner of string comparison defined for the database.
Page VerifyPage Verify is a database option that defines the SQL Server mechanism of verifying page consistency when the page is written to disk and when it is read again from disk.
The recommendation is CHECKSUM.
DBCC last successLast successful Database check. Checks the database’s integrity, tables, indexes, schema, etc.
Running this test on a daily basis is very important for the proper functioning of the organization with the databases. The test runs on both a physical level and a logical level.
CompatibilityThe Compiler version at the Database level.
Diff BackupDate of the last differential backup performed for the Database.
In general, it’s recommended once a day but it depends on the full backup frequency.
Read more
TransactionsThe number of transaction operations UPDATE, INSERT, DELETE, BEGIN TRAN executed per second.
A high value (above average) may be the reason for slowness or log growth issues.
Log FlushThe time it takes to save the log found in the physical memory to the disk storage.
High values slow down transaction, update, and save operations in SQL.
File stream GrowthThe storage volume used by file streams.
File streams enable the storage of large amounts of data (more than 2 GB), such as large documents, images, or files.
High values may cause storage problems on the data drive.
File stream DriveThe file stream’s drive.
IOThe amount of read and write operations from the disk storage at the sampled time.
A high value can cause slowness as a result of a load on the disk storage
Log sizeThe size assigned to the log files of the database in MB.
Log UseThe size of the log used in MB
Log FlushThe process of writing the contents of the transaction log buffer to the physical transaction log file on the disk, measured in milliseconds.
A longer flush time may increase the risk of data loss.
Log Reuse WaitThe reason the transaction log of a database is unable to truncate or reuse log space. NOTHING is a good value. REPLICATION appears for a database that takes part in replication.
Creation DateThe Database creation date.
Data FilesThe number of .mdf files the database contains (filegroups).
Log FilesThe number of .ldf files the database contains.
Disk Log IO/secThe number of log input and output operations on the disk storage per second.
Data Read IO/secThe number of reads from the disk storage per second.
Data Write IO/secThe number of writes from the disk storage per second.
Open transactionsThe number of open transactions per second. A high number of open transactions can cause log oversize.Read more
Log transactionsThe log amount in MB while there’s an open transaction. An increased log growth can be caused when transactions don’t clean themselves while running.
Transaction DurationThe duration of the transaction.
Alwayson StateThe DB’s AlwaysOn state: synchronized or not synchronized. Not synchronized means that there’s a problem with AlwaysOn.Read more
AlwaysOn StatusThe AlwaysOn status: healthy or not healthy. Not healthy means that there’s a problem with AlwaysOn.Read more
AlwaysOn graphThe value is 1 if AlwaysOn is active and 0 if it is not.
AlwaysOn Log records not committed at SecondaryThe number of AlwaysOn log records sent from the primary server that have not yet been committed at the secondary server.
When these graphs are active, this is the secondary group; if there’s no data, it’s the primary group.
AlwaysOn Log records waiting to send to SecondaryThe number of AlwaysOn log records waiting to be sent from the primary server to the secondary server.
When these graphs are active, this is the secondary group; if there’s no data, it’s the primary group.
AlwaysOn is Primary0 for secondary server databases, 1 for primary server databases
AlwaysOn group nameThe AlwaysOn group that the DB belongs to.
The group can hold several databases in the Enterprise edition.
Mirror StatusThe DB’s Mirror status: synchronized or not synchronized.
Not synchronized means that there’s a problem with the Mirror.
Read more
Mirror Status GraphThe value is 1 if the Mirror is active and 0 if it is not.
Mirror ModeThe DB’s mirror mode on the primary server is “principal,” and on the secondary server is “mirror.”Read more
Mirror Log records not committed at SecondaryAmount of the Mirror logs that couldn’t be committed yet from the primary to the secondary server.Read more
Mirror Log records waiting to send to SecondaryAmount of the Mirror logs waiting to be sent to the secondary server (from the primary server).
Data DriveThe drive where the data files of the DB are located. Keeping the data, log, and tempdb files on separate drives is recommended.
Log DriveThe drive where the log files of the DB are located. Keeping the data, log, and tempdb files on separate drives is recommended.
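
The Full Backup and Log Backup rows above give general recommendations: a full backup once a day, a log backup once an hour, and no log backup date when the recovery model is "simple". A minimal sketch of checking the last backup dates against those recommendations follows. The function name `backup_findings` is hypothetical, and the exact cut-offs (24 hours and 1 hour) are a literal reading of the recommendations rather than the platform's alert rules.

```python
from datetime import datetime, timedelta


def backup_findings(recovery_model, last_full, last_log, now):
    """Compare last backup dates with the general recommendations.

    last_full and last_log are datetime values, or None when no
    backup of that kind exists.
    """
    findings = []
    if last_full is None or now - last_full > timedelta(days=1):
        findings.append("full backup older than one day")
    if recovery_model.lower() == "simple":
        # Under the simple recovery model the log backup date should be null.
        if last_log is not None:
            findings.append("log backup present under simple recovery")
    elif last_log is None or now - last_log > timedelta(hours=1):
        findings.append("log backup older than one hour")
    return findings
```

The `now` argument is passed in explicitly so that the check can be run against any sampled moment, not only the current time.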

— Wait Stats on Azure MSSQL

MetricDescriptionInvestigate this alert
Wait TypeThe SQL Server wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there is a wait for the specific resource, which can be caused by specific delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
If the value is higher than average, there is a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
If the value is higher than average, there is a bottleneck in a specific resource, which can be caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.

-Amazon RDS

— Service

MetricDescriptionInvestigate this alert
CloudThe Database name given in the Aimbetter configuration.
CPUThe overall percentage of time the CPU spends executing non-idle tasks.
Memory FreeThe amount of system memory currently available.
Free Storage SpaceThe free storage space in GB out of allocated storage.
Availability ZoneThe datacenter location within a specific AWS Region.
Receive Byte/secThe incoming data transfer rate in bytes per second.
Send Byte/secThe outgoing data transfer rate in bytes per second.
Read IOPSThe number of read operations performed on a specific storage volume or disk per second.
Write IOPSThe number of write operations performed on a specific storage volume or disk per second.
Read LatencyThe response time to a read request.
Write LatencyThe response time to a write request.
Availability ZoneThe location where the Amazon cloud computing resources are hosted for this Database.
Allocated StorageThe storage, in gibibytes, that is allocated for the DB instance.
DB Instance ClassDetermines the computation and memory capacity of an Amazon RDS DB instance.
Storage EncryptedOn a database instance running with Amazon RDS encryption, data stored is encrypted, as are its automated backups, read replicas, and snapshots.
Storage TypeAmazon RDS provides three storage types: General Purpose SSD (also known as gp2 and gp3), Provisioned IOPS SSD (also known as io1), and magnetic (also known as standard).
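
The table above reports Free Storage Space in GB and Allocated Storage in gibibytes, so the two values need a common unit before they can be compared. The sketch below converts both to bytes and returns the free share as a percentage. It assumes that the GB figure is decimal (10^9 bytes), which this guide does not state; the function name `free_storage_pct` is hypothetical.

```python
GB = 10**9   # decimal gigabyte (assumed unit of Free Storage Space)
GIB = 2**30  # gibibyte (unit of Allocated Storage, per the table)


def free_storage_pct(free_gb, allocated_gib):
    """Return free storage as a percentage of allocated storage."""
    if allocated_gib <= 0:
        raise ValueError("allocated storage must be positive")
    return 100.0 * (free_gb * GB) / (allocated_gib * GIB)
```

If both figures turn out to use the same unit, the conversion constants cancel and the function reduces to a plain ratio.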

— MSSQL on Amazon RDS

MetricDescriptionInvestigate this alert
VersionThe SQL version, installed on the server
InstanceThe SQL server instance name given in the installation.
Test connectionA check of the time to establish a connection to the SQL server in milliseconds. A high value indicates that there are network communication problems or a load on the SQL server.Read more
Last RestartSQL server last restart
CollationIn SQL Server, a collation is a set of rules that determine how data is sorted and compared, for string based operations. SQL collations allow database administrators to define the appropriate rules for sorting and comparing strings based on the specific language and cultural context of the data being stored.
EditionThe installed SQL Server edition. There are numerous editions (for example, Express, Developer, and Enterprise), and each edition has two runtimes: 32-bit or 64-bit.
SPThe Service Pack, which includes cumulative updates of all the fixes and improvements from previous service packs and cumulative updates for a specific version of SQL Server.
Page life expectancyThe time, in seconds, that SQL Server keeps retrieved information in the server’s physical memory. Low values indicate that SQL Server is replacing the information held in physical memory at a high frequency and needs more physical memory in order to perform faster.Read more
User ConnectionsA connection established between a client application and a database server using SQL credentials is considered a single user connection. A large number may indicate a load on the system, a fault, or a security error.Read more
Connection reuse/secThe total number of logins started from the connection pool per second. Apps tend to open and close connections repeatedly; this value indicates how often connections are reused.
Batch requests/secThe number of update, retrieval, deletion, or save operations in SQL Server per second. This metric enables the user to detect abnormalities in the number of operations on the SQL server.Read more
Buffer cache hit ratioThe percentage of page requests that are satisfied from the cache (physical memory of the SQL server) without reading from disk. Values below 90% indicate frequent reads from disk storage. You should investigate whether other programs or processes are consuming a large amount of physical memory and consider adding physical memory to the SQL server.Read more
Page reads/secThe number of page reads (each page is 8 KB) from the disk per second. Many reads indicate that we should examine the SQL server’s integrity, indexing, and system query logic.
Page writes/secThe number of page writes (each page is 8 KB) to the disk per second. Many writes indicate that we should examine the SQL server’s integrity, indexing, and system query logic.
SP CompilationThe number of times per second that SQL Server compiles execution plans for queries. A large number of compilations along with a small number of Batch requests indicates heavy use of direct queries and sp_executesql, rather than procedures with defined variables.
SP Re CompilationThe number of times per second that SQL Server recompiles execution plans for queries. A large number of recompilations, combined with a small number of Batch requests, indicates that the amount of data retrieved has grown, a statistics update has been performed, or the indexes have been rebuilt. We should investigate the amount of data and whether any of these operations have been performed.
Page LookupsThe number of times SQL Server looks up pages (each page is 8 KB) in the physical memory. A ratio of (Page lookups/sec) / (Batch requests/sec) greater than 100 indicates that some queries are not running optimally.
Latches TimesThe duration in seconds for which a thread holds exclusive access (a “latch”) to a shared resource (for example, a “latched table”). A high number of latches slows down data retrieval from the latched tables. We should investigate changing the Update or Deletion method.
Page Splits/secThe number of page splits per second. A page is split to allocate space when the index does not have enough free space. A rate higher than 20 per second requires a check of the index specifications.
Checkpoint Pages/secThe number of data pages written to disk per second during a checkpoint operation. A checkpoint is a process in which SQL Server ensures that all modified data pages in memory are flushed and written to disk to maintain data consistency and durability.
DB IO/secThe amount of reads and writes of the entire database per second
Target MemoryThe target RAM memory limit that the SQL Server is allowed to consume and utilize for its internal operations.
MemoryThe amount of memory SQL Server is utilizing in MB. If SQL is not using the maximum memory amount specified, we should consider lowering this amount.
Memory DetailsA pie chart showing how the physical memory used by SQL Server is divided between the database, internal needs, and free memory, in MB.
DB MemoryThe memory used by the SQL Server instance to cache data and other objects related to specific databases.
Free MemoryThe amount of physical memory not utilized by SQL Server in MB.
A high value may indicate that the assigned memory to the SQL Server can be reduced.
Internal MemoryThe amount of physical memory that SQL Server is utilizing for internal operations, not including operations for the database, in MB. For example: buffer pool, execution plans, system tables, procedures cache, and management.Read more
Memory (min)The minimum amount of assigned physical memory that SQL Server can use, in MB.
Memory (max)The maximum amount of assigned physical memory that SQL Server can use, in MB.
Temp table creation/secThe number of temp table creations per second.
UptimeHow long the SQL Server instance has been up and running. It’s recommended to keep the uptime as high as possible on high-traffic instances.
Cluster active nameThe name of the active cluster in clustered instances.
Cluster nodes downThe amount of cluster nodes that are down. The server may be one of the cluster nodes.
Transactions/secNumber of open transactions per second.
Values higher than usual can cause system slowness.
Read more
Lazy writes/secMeasures the process of flushing modified data pages from memory (buffer cache) to disk (data files) per second. By deferring the immediate disk writes and batching them together, lazy writes help improve system performance. It reduces the frequency of disk I/O operations, minimizes disk access latency, and allows the system to perform multiple writes in a more efficient manner.
A high value indicates that the SQL server needs more memory and can affect other OS resources, such as disk IO and CPU usage.
Index Full scan/secThe number of indexes that were fully scanned per second. This is an alternative to a full table scan when the index contains all the columns that are needed for the query, and at least one column in the index key has a “NOT NULL” constraint.
Index page splits/secThe number of index page splits per second, affected by fragmentation. A page split occurs when there is no free space on the page for updating or inserting a value into the table; the page is split to free space so that the command can complete.
Logins/secThe number of logins to the SQL server per second. A high value may point to security or application problems.
Logouts/secThe number of logouts from the SQL server per second. A high value may point to connection problems.
Core availableThe total number of cores in the server.
Core in useThe number of cores that are assigned for SQL Server use. The recommendation is that SQL Server uses all cores.
Session Memory waitThe number of sessions that are waiting for free memory. Those queries don’t have enough RAM memory to start running, so they are “delayed.”
Create temp table/ variablesThe number of created temporary tables/ variables available. A high value can indicate unnecessary open connections.
TempDB free spaceTempdb database unused data space in KB. A high value may indicate unusual data growth.
Session avg. wait for signalThe average time, in milliseconds, that SQL Server reports sessions spending in a wait.
The threshold is based on past activity and behavior. When the value is higher than average, it can cause slow SQL Server performance.
Read more
Session CPU waitThe number of queries that SQL Server reports as waiting for CPU availability.
The threshold leans on past activity and behavior. When the value is higher than average, it’s recommended to investigate and look for those queries.
Read more
Currently ActiveThe number of queries that are currently running (status is running; for Oracle, status is active).
Currently BlockedThe number of queries that are currently blocked (status is suspended)Read more
Currently SleepingThe number of queries that are currently sleeping. A query that has been executed, and its results have been returned to the client application, but the connection to the database is still open and waiting for further instructions or actions from the client.
Currently BackgroundThe number of queries that are running in the background. Background queries run separately from the main execution flow of a program or application in order to prevent blocking.
Currently Open TransactionsThe number of queries that have open transactions at the moment.Read more
Currently KilledThe number of queries that were killed in the last minute.Read more
Currently Avg Duration/secMeasures the average time taken to execute a single query or command in seconds.
Number of Queries 0-9.99A count of queries running for less than 10 seconds.Read more
Number of Queries 10-19.99A count of queries running between 10 and 20 seconds.Read more
Number of Queries 20-29.99A count of queries running between 20 and 30 seconds.Read more
Number of Queries 30-59.99A count of queries running between 30 and 60 seconds.Read more
Number of Queries over 60A count of queries running over 60 seconds.Read more
Subscriber High latencyThe time it takes for changes made at the publisher to be replicated to the subscriber.Read more
Distributor High latencyThe time it takes for transactional changes generated at the publisher to be delivered to the distributor for further replication processing.Read more
LogReader High latencyThe time it takes for the LogReader agent to read the transaction log from the publisher and deliver the changes to the distributor for replication.Read more
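
The rules of thumb quoted in this table (buffer cache hit ratio below 90%, more than 100 page lookups per batch request, more than 20 page splits per second) can be checked together. The following is a minimal sketch only: the `InstanceSample` class and `check_instance` function are hypothetical names invented for illustration and are not part of the AimBetter platform.

```python
from dataclasses import dataclass


@dataclass
class InstanceSample:
    """One hypothetical sample of SQL Server instance counters."""
    buffer_cache_hit_ratio: float   # percent
    page_lookups_per_sec: float
    batch_requests_per_sec: float
    page_splits_per_sec: float


def check_instance(sample: InstanceSample) -> list[str]:
    """Return one warning per rule of thumb from the table that is violated."""
    warnings = []
    if sample.buffer_cache_hit_ratio < 90:
        warnings.append("Buffer cache hit ratio below 90%")
    # Skip the ratio when there are no batch requests, to avoid dividing by zero.
    if sample.batch_requests_per_sec > 0:
        ratio = sample.page_lookups_per_sec / sample.batch_requests_per_sec
        if ratio > 100:
            warnings.append("Page lookups per batch request above 100")
    if sample.page_splits_per_sec > 20:
        warnings.append("Page splits/sec above 20")
    return warnings


if __name__ == "__main__":
    busy = InstanceSample(85.0, 30000.0, 100.0, 25.0)
    for warning in check_instance(busy):
        print(warning)
```

The thresholds are the ones stated in the table; in practice they would be tuned against each server's own baseline.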

— Oracle on Amazon RDS

MetricDescriptionInvestigate this alert
DatabaseThe Database name.
EditionThe Database edition.
32/64The Database runtime – 32bit or 64bit.
VersionThe Database version.
Log ModeRefers to how the Database manages its redo logs, which are used for data integrity or for recovery in case of a disaster.
There are several modes:
ARCHIVELOG Mode: allows creating backups that capture the changes made to the database since the last backup.
NONARCHIVELOG Mode: limits the ability to perform point-in-time recovery, since archived redo logs are not available.
FORCE LOGGING Mode: all data changes made to the database are logged to the redo log files, even for operations that would not typically generate redo logs.
National Language (NLS)Refers to a set of features and settings that allow Oracle Database to handle multiple languages.
Patch LevelThe version number of the Oracle software and the cumulative updates and releases.
Last RestartLast restart in the format date:hour:minute
Test ConnectionA check of the time to establish a connection to the Database in milliseconds. A high value indicates that there are network communication problems or a load on the Oracle Database.Read more
Session LimitThe number of user sessions connected to the database at the moment, as a percentage of the maximum sessions allowed.
Session (Max)The maximum number of concurrent user sessions allowed to connect to the database. Each user session represents a connection with the database.
Processes (Max)The maximum number of concurrent user processes allowed to connect to the database.
Default block sizeThe standard size of a data block used for storing data and managing database objects within the database’s data files. As of Oracle Database 12c, the default block size is typically 8192 bytes (8 KB) for a general-purpose database.
Open Cursors (max)The maximum number of cursors that can be open in the database. A cursor is a programmatic handle or pointer used by the database to access or process the results of queries or DML statements. It is essential for developers to explicitly close cursors when they are no longer needed.
DR Last Sync DateThe last synchronization date and time for a Data Guard configuration (a high-availability and disaster recovery solution that keeps standby databases synchronized with the primary database).Read more
Physical ReadsThe number of reads across the entire database, measured in blocks as defined in “Default Block Size” (default is 8 KB).
Physical WritesThe number of writes across the entire database, measured in blocks as defined in “Default Block Size” (default is 8 KB).
DR Full BackupThe last date and time of a full backup taken from the primary database and used to initialize or restore a standby database in a Data Guard configuration.Read more
Archive Log BackupThe last date and time of a backup operation that specifically targets the archived redo logs.Read more
CTL SP File BackupThe last date and time of a backup of the “control file and server parameter (SP) file.” It helps to ensure the recoverability of the database in case of disasters, media failures, or user errors.Read more
Avg. ThreadsThe average number of concurrent executions of multiple tasks or processes, such as operating system threads, Java threads, database sessions, or parallel query executions. Calculated as the number of active sessions divided by the number of CPU cores, as a percentage.
Database CPU TimeThe percentage of CPU time that remains available for the execution of SQL statements by the Oracle database processes and for other database-related operations. Higher values mean fewer waits for CPU, which improves performance. Best practices include minimizing scans and waits during query execution.
Buffer Cache HitThe percentage of requested data blocks found in the database buffer cache, avoiding the need to read the block from disk.Read more
PGA Cache HitThe percentage of times process data requests are satisfied from the Program Global Area (PGA) cache allocation, without needing additional memory or a read from disk. The higher this value, the more efficient the database is.Read more
DeadlocksThe number of deadlocks in the database server.Read more
Invalid ObjectThe number of database objects that are currently in an invalid state.
Redo Entries (rows update)The number of records that capture changes made to the database, related to the redo log. When this value is higher than usual, it may indicate a possible cause of slow performance.
Query ExecuteThe number of queries executing at the moment.
Avg. SessionsThe average number of sessions at the moment.
Avg. ActiveThe average number of active sessions at the moment.
Avg. BlockingThe average number of blocking sessions at the moment.Read more
Avg. Sleeping BlockingThe average number of sessions that are both sleeping and blocking at the moment.
Avg. BlockedThe average number of blocked sessions at the moment.Read more
Avg. Open TransactionThe average number of open transactions at the moment.Read more
Avg. SleepingThe average number of sleeping sessions at the moment. A sleeping session has executed its query and returned the result to the client application, but its connection to the database is still open and waiting for further instructions or actions from the client.
Avg. BackgroundThe average number of background sessions at the moment. Background sessions run separately from the main execution flow of a program or application in order to prevent blocking.
Avg. DurationThe average duration, in seconds, of all queries running at the moment in different sessions.
Number of Queries 0-9.99A count of queries running for less than 10 seconds.Read more
Number of Queries 10-19.99A count of queries running between 10 and 20 seconds.Read more
Number of Queries 20-29.99A count of queries running between 20 and 30 seconds.Read more
Number of Queries 30-59.99A count of queries running between 30 and 60 seconds.Read more
Number of Queries over 60A count of queries running over 60 seconds.Read more
Archive Logs RetentionThe retention period for archived redo log files.
Log SwitchThe count of completed log file switches. A log switch occurs when the database stops writing redo log entries to one redo log group (also known as a redo log file) and starts writing to another.
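
Several metrics in this table are derived values: Session Limit and Avg. Threads are percentages, and the "Number of Queries" rows are duration buckets. The sketch below shows one plausible way to compute them. All function names are hypothetical, and placing a query of exactly 60 seconds in the "over 60" bucket is an assumption, since the table does not say which bucket it belongs to.

```python
def session_limit_pct(current_sessions: int, max_sessions: int) -> float:
    """Connected sessions as a percentage of the maximum allowed."""
    return 100.0 * current_sessions / max_sessions


def avg_threads_pct(active_sessions: float, cpu_cores: int) -> float:
    """Active sessions divided by CPU cores, as a percentage."""
    return 100.0 * active_sessions / cpu_cores


# (lower bound inclusive, upper bound exclusive, label), in seconds.
BUCKETS = [
    (0, 10, "0-9.99"),
    (10, 20, "10-19.99"),
    (20, 30, "20-29.99"),
    (30, 60, "30-59.99"),
]


def bucket_counts(durations_sec: list[float]) -> dict[str, int]:
    """Count queries per duration bucket; 60 seconds and above go to 'over 60'."""
    counts = {label: 0 for _, _, label in BUCKETS}
    counts["over 60"] = 0
    for duration in durations_sec:
        if duration < 0:
            raise ValueError("duration cannot be negative")
        for low, high, label in BUCKETS:
            if low <= duration < high:
                counts[label] += 1
                break
        else:
            # No bucket matched, so the duration is 60 seconds or more.
            counts["over 60"] += 1
    return counts
```

For example, 150 connected sessions out of a maximum of 600 gives a Session Limit of 25%, and 4 active sessions on 8 cores gives an Avg. Threads value of 50%.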

— DB on Amazon RDS MSSQL

MetricDescriptionInvestigate this alert
StatusDatabase Status:
● Online – the database is available
● Offline – the database is not in use
● Mirror Disconnect – the sync is disconnected.
● Mirror Principal – the database is the principal, the source of all updates in the mirroring sync.
● Mirror – the database is synchronized.
● Restoring – the database is currently being restored
● Suspect – the database is defective
Read more
InstanceThe SQL server instance name given in the installation.
DatabaseThe Database name
RecoveryThe recovery model determines the possible restore options specified for the database. It defines how the database transaction logs are managed and which data type can be recovered in case of a failure.
Full BackupThe date of the last Full Backup performed on the database.
Full Backup files are .bak files or snapshots.
The general recommendation is a Full Backup once a day.
Read more
Log BackupThe date of the last backup of Log changes performed on the database.
Log Backup files are .trn files.
The general recommendation is a log backup once an hour, but if the recovery model is “simple,” this value should be null.
Read more
MemoryThe amount of memory the database is taking up in the physical memory in MB.
SizeA pie chart showing the distribution of data and log file sizes occupying the disk storage, measured in MB. It is not recommended that the log take up more than 60% of the database size. If it does, we should investigate the process integrity of this database, such as transactions (containing recursion) and backups. Bandwidth relates to the speed at which data can be transmitted between devices. A higher bandwidth allows for faster and more extensive data transfer.
Disk IO/secAmount of disk reads and writes per second.
Usually, the main or biggest DBs will have a high value. Higher values than usual can indicate a performance problem of queries causing other queries to wait for free IO.
Data GrowthThe rate of information growth in the database on the disk storage in MB, which includes all the filegroups that contain the primary data file (.mdf).
A lack of space in the disk storage may indicate substantial data growth in the database.
Read more
Log GrowthThe rate of log growth in the database on the disk storage in MB.
A lack of space in the disk storage may indicate substantial log growth in the database.
Read more
In-MemoryKnown as In-Memory OLTP, a feature in SQL Server that leverages memory-resident tables and natively compiled stored procedures for better performance in specific transactions.
Unused data spaceThe free space of the DB: allocated data space that is not in use.
A value higher than 50% indicates that a shrink of the data files should be considered.
It is recommended to keep at least 10% of the DB space unused, for indexes and other needs.
CollationThe language and the manner of string comparison defined for the database.
Page VerifyPage Verify is a database option that defines the SQL Server mechanism of verifying page consistency when the page is written to disk and when it is read again from disk.
The recommendation is CHECKSUM.
DBCC last successLast successful Database check. Checks the database’s integrity, tables, indexes, schema, etc.
Running this test on a daily basis is very important for the proper functioning of the organization with the databases. The test runs on both a physical level and a logical level.
CompatibilityThe compatibility level of the Database, which sets the compiler version used at the Database level.
Diff BackupDate of the last differential backup performed for the Database.
In general, it’s recommended once a day but it depends on the full backup frequency.
Read more
TransactionsThe number of transaction operations UPDATE, INSERT, DELETE, BEGIN TRAN executed per second.
A high value (above average) may be the reason for slowness or log growth issues.
Log FlushThe time it takes to save the log held in the physical memory to the disk storage.
High values slow down transaction, update, and save operations in SQL Server.
File stream GrowthThe storage volume used by file streams.
File streams enable the storage of large amounts of data (more than 2 GB), such as large documents, images, or files.
High values may cause a storage problem on the data drive.
File stream DriveThe file stream’s drive.
IOThe amount of read and write operations from the disk storage at the sampled time.
A high value can cause slowness as a result of a load on the disk storage
Log sizeThe size assigned to the log files of the database in MB.
Log UseThe size of the log used in MB
Log FlushThe process of writing the contents of the transaction log buffer to the physical transaction log file on the disk, measured in milliseconds.
Higher time may increase the chance for data loss.
Log Reuse WaitA condition where the transaction log of a database is unable to truncate or reuse log space. NOTHING is a good value. REPLICATION appears for a database that takes part in replication.
Creation DateThe Database creation date.
Data FilesThe number of .mdf files the database contains (filegroups).
Log FilesThe number of .ldf files the database contains.
Disk Log IO/secThe number of log input and output operations on the disk storage per second.
Data Read IO/secThe number of reads from the disk storage per second.
Data Write IO/secThe number of writes to the disk storage per second.
Open transactionsThe number of open transactions per second. A high number of open transactions can cause log oversize.Read more
Log transactionsThe log amount in MB while there’s an open transaction. An increased log growth can be caused when transactions don’t clean themselves while running.
Transaction DurationThe duration of the transaction.
AlwaysOn StateThe DB’s AlwaysOn state: synchronized or not synchronized. Not synchronized means that there’s a problem with AlwaysOn.Read more
AlwaysOn StatusThe DB’s AlwaysOn status: healthy or not healthy. Not healthy means that there’s a problem with AlwaysOn.Read more
AlwaysOn graphIf it’s active, the value is 1. If it’s not active, the value is 0.
AlwaysOn Log records not committed at SecondaryThe number of AlwaysOn log records that have not yet been committed from the primary to the secondary server.
When these graphs are active, this is the secondary group; if there’s no data, it’s the primary group.
AlwaysOn Log records waiting to send to SecondaryThe number of AlwaysOn log records waiting to be sent from the primary server to the secondary server.
When these graphs are active, this is the secondary group; if there’s no data, it’s the primary group.
AlwaysOn is Primary0 for secondary server databases, 1 for primary server databases
AlwaysOn group nameThe group that a DB in AlwaysOn belongs to.
The group can hold several databases in the Enterprise edition.
Mirror StatusThe DB’s Mirror status: synchronized or not synchronized.
Not synchronized means that there’s a problem with the Mirror.
Read more
Mirror Status GraphIf it’s active, the value is 1. If not, the value is 0.
Mirror ModeThe DB’s mirror mode on the primary server is “principal,” and on the secondary server is “mirror.”Read more
Mirror Log records not committed at SecondaryAmount of the Mirror logs that couldn’t be committed yet from the primary to the secondary server.Read more
Mirror Log records waiting to send to SecondaryAmount of the Mirror logs waiting to be sent to the secondary server (from the primary server).
Data DriveThe drive where the data files of the DB are located. It’s recommended to keep the data, log, and tempdb files on separate drives.
Log DriveThe drive where the log files of the DB are located. It’s recommended to keep the data, log, and tempdb files on separate drives.
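
The database-level recommendations in this table (log no larger than 60% of the database size, unused data space between 10% and 50%, a full backup daily, a log backup hourly unless the recovery model is simple) can be combined into one check. This is a sketch under stated assumptions: `check_database` and its parameters are hypothetical, and "database size" is assumed to mean data size plus log size.

```python
from datetime import datetime, timedelta
from typing import Optional


def log_share_pct(data_mb: float, log_mb: float) -> float:
    """Log size as a percentage of data + log (assumed meaning of 'database size')."""
    total = data_mb + log_mb
    return 100.0 * log_mb / total if total else 0.0


def check_database(data_mb: float, log_mb: float, unused_pct: float,
                   recovery_model: str, last_full: Optional[datetime],
                   last_log: Optional[datetime], now: datetime) -> list[str]:
    """Return findings for the recommendations listed in the table."""
    findings = []
    if log_share_pct(data_mb, log_mb) > 60:
        findings.append("Log takes more than 60% of the database size")
    if unused_pct > 50:
        findings.append("Unused data space above 50%: consider a shrink")
    elif unused_pct < 10:
        findings.append("Unused data space below the recommended 10%")
    if last_full is None or now - last_full > timedelta(days=1):
        findings.append("No full backup in the last day")
    # With the simple recovery model, the log backup date is expected to be null.
    if recovery_model.lower() != "simple":
        if last_log is None or now - last_log > timedelta(hours=1):
            findings.append("No log backup in the last hour")
    return findings
```

A database with 300 MB of data and 700 MB of log would be flagged, because the log share is 70%.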

— Wait Stats on Amazon RDS MSSQL

MetricDescriptionInvestigate this alert
Wait TypeThe SQL Server wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there’s a wait for the specific resource, which can be caused by one or more delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
If the value is higher than average, there may be a bottleneck in a specific resource, caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
If the value is higher than average, there may be a bottleneck in a specific resource, caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.

— Wait Stats on Amazon RDS Oracle

MetricDescriptionInvestigate this alert
Wait TypeThe Oracle wait type name.
Wait (%)The percentage of the wait time compared to other waits.
If the value is higher than average, there’s a wait for the specific resource, which can be caused by one or more delayed or long-running queries.
Avg Wait (ms)The average wait time in milliseconds.
If the value is higher than average, there may be a bottleneck in a specific resource, caused by delayed or long-running queries.
Read more
Wait (ms)The wait time in milliseconds.
If the value is higher than average, there may be a bottleneck in a specific resource, caused by delayed or long-running queries.
Read more
TasksThe number of tasks waiting for the wait type at the moment.
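
Wait (%) is described as the share of one wait type's time relative to all other waits, and the alerts compare a value with its average. The sketch below illustrates both ideas. The function names are hypothetical, and using the plain mean of past samples as the "average" is an assumption, since the page does not define how the baseline is calculated.

```python
from statistics import mean


def wait_percentages(wait_ms_by_type: dict[str, float]) -> dict[str, float]:
    """Share of each wait type in the total wait time, as a percentage."""
    total = sum(wait_ms_by_type.values())
    if total == 0:
        return {name: 0.0 for name in wait_ms_by_type}
    return {name: 100.0 * ms / total for name, ms in wait_ms_by_type.items()}


def is_above_average(current: float, history: list[float]) -> bool:
    """True when the current value exceeds the mean of past samples (assumed baseline)."""
    return bool(history) and current > mean(history)


if __name__ == "__main__":
    # Sample wait times in milliseconds, made up for illustration.
    sample = {"PAGEIOLATCH_SH": 300.0, "CXPACKET": 100.0}
    print(wait_percentages(sample))
    print(is_above_average(300.0, [120.0, 150.0, 90.0]))
```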

Queries Module

— Live/ History

MetricDescriptionInvestigate this alert
SessionDisplays the login name (SPID)
Start TimeThe time when the execution of the query has started in the database.
DurationFor how long the query has been running, shown in Hours:Minutes:Seconds format.
Max DurationIn the case of blocking queries, it displays the highest execution time from the group of blocked queries. It is the sum of the time the blocked query ran before it was blocked and the duration of the blocking query.
NotesNotes about problems: missing indexes, non-optimal query plan, OS resources issue, antivirus application running, and more.
BlocksThe number of queries that are currently blocked by the query.
Open TranThe current number of open transactions for this query.
ClientThe server from which the query originates, by application level.
DBFor MSSQL, it displays the name of the database on which the query is being executed. For Oracle, it displays the database ID.
Instance IDFor Oracle, it displays the ID of the Oracle instance.
AppThe application from which the query originates.
Process IDFor MSSQL, it is the process ID as identified in the client.
LoginThe login name.
OS LoginFor Oracle, it displays the client login from which the query is running.
SQL IDFor Oracle, it displays the query ID.
Total RowsThe cumulative row count of the same Query_ID (SQL_ID) since the last calculation of the query’s execution plan and statistics.
CommandFor a Live query, it is the command being executed at the moment. For a History query, it is the last command executed. For example: SELECT, CHECKPOINT, AWAITING COMMAND, INSERT, UPDATE, DELETE, EXECUTE.
StatusFor a Live query, it is the current status. For a History query, it is the last status of the executed query. For MSSQL Server: suspended, running, runnable, active, inactive. For Oracle: active, inactive.
Wait ResourceA specific type of wait event that occurs when a query or transaction is waiting for access to a resource.
For MSSQL, it indicates the specific data page that the query or transaction is waiting to access, represented in the format database_id:file_id:page_id.
For Oracle, it displays the instance name (spid) and the specific wait resource: cluster, network, user I/O, other.
Last WaitThe last wait stats identified. The options are based on the wait stats available in Performance module.
Disk I/OSummarizes all the Disk I/O consumed by the query since it started running (might be in Bit, KB, MB, GB, TB).
Cache ReadsSummarizes the total RAM memory consumed by the query (MB, GB, TB).
CPUThe query’s CPU usage time (sec, min).
tempdbThe tempdb data growth caused by the query. We recommend separating its file’s drive from other data files in order to avoid storage space pressure in the event of extensive data growth.
tempdb logThe tempdb log data growth caused by the query.
DB logThe log growth of the database originated by the query execution. A high value might cause a log over-sizing and log drive space pressure.
Plan Compile TimeThe time taken by the system to generate and optimize the execution plan for an SQL query. If it exceeds a few milliseconds, you should consider improving the query’s execution plan.
Transaction IsolationDisplays the varying degrees of isolation and concurrency control. For example: read committed, read uncommitted, repeatable read, serializable.
ExecutionsThe cumulative execution count of the same Query_ID (SQL_ID in Oracle) since the last calculation of the query’s execution plan and statistics.
Avg RowsThe output of Total Rows / Executions.
Avg SecThe output of Total Seconds / Executions.
Total SecondsThe cumulative seconds of the same Query_ID (SQL_ID) since the last calculation of the query’s execution plan and statistics.
Host Process IDThe dedicated process or thread running on the server responsible for the query execution.
Client ProcessThe process path as recognized on the client and the client’s current CPU usage as a percentage.
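
Avg Rows and Avg Sec are defined above as Total Rows / Executions and Total Seconds / Executions, and Duration is shown in Hours:Minutes:Seconds format. A minimal sketch of both calculations follows. The function names are hypothetical, and the zero-padded two-digit output is an assumption about how the duration is displayed.

```python
def averages(total_rows: int, total_seconds: float, executions: int) -> tuple[float, float]:
    """Return (Avg Rows, Avg Sec) as Total Rows / Executions and Total Seconds / Executions."""
    if executions == 0:
        # No executions recorded yet, so there is nothing to average.
        return 0.0, 0.0
    return total_rows / executions, total_seconds / executions


def format_duration(total_seconds: int) -> str:
    """Format a duration in whole seconds as Hours:Minutes:Seconds."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    avg_rows, avg_sec = averages(total_rows=1000, total_seconds=50.0, executions=20)
    print(avg_rows, avg_sec)
    print(format_duration(3725))
```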

— Filters in Live / History

FilterDescriptionInvestigate this alert
Long or Blocked SessionDisplays only long sessions or blocked sessions.
Missing IndexChoose ‘exists’ to display sessions that have a ‘missing indexes’ recommendation.
Plan ImprovementChoose ‘exists’ to display sessions that have a ‘plan improvement’ recommendation for the query’s execution plan.
Max DurationSet the time to display sessions longer or shorter than this time.
Query Hash/IDFilters by a specific query id.
Disk I/OSet the Disk I/O volume to display sessions that transferred more or less than this volume.
CPUSet the CPU time to display sessions that consumed more or less than this time.
QueriesDisplay queries that contain a specific string (text). The text should be less than 100 characters long.
Wait StatsSelect the Wait Stats categories to display sessions that are included in these categories.
Query AlertsChoose “exists” to display sessions that had some related alert.
OS UserFor Oracle, filter by the OS user name.
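
The filters above combine simple predicates over sessions. The sketch below shows how a text filter with the stated limit of less than 100 characters could work alongside "more than" thresholds for duration and CPU. The `Session` class and `matches` function are hypothetical, and case-insensitive matching is an assumption; the page does not say how the platform compares text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """A hypothetical, simplified session record."""
    query_text: str
    duration_sec: float
    cpu_sec: float


def matches(session: Session, text: Optional[str] = None,
            min_duration_sec: Optional[float] = None,
            min_cpu_sec: Optional[float] = None) -> bool:
    """Return True when the session passes every filter that was set."""
    if text is not None:
        if len(text) >= 100:
            raise ValueError("search text should be less than 100 characters")
        # Case-insensitive substring match (an assumption).
        if text.lower() not in session.query_text.lower():
            return False
    if min_duration_sec is not None and session.duration_sec <= min_duration_sec:
        return False
    if min_cpu_sec is not None and session.cpu_sec <= min_cpu_sec:
        return False
    return True
```

A "less than" variant would mirror the comparisons; only the "more than" direction is shown to keep the example short.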

— QAnalyze

MetricDescriptionInvestigate this alert
Max Query CPUCalculates the query’s maximum CPU utilization percentage in a selected time frame.
Execution CountCounts the number of executions of a specific query code.
AVG. DurationCalculates the average duration of a query in a selected time range.
AVG. Disk I/OCalculates average Disk I/O of a query in a selected time range.
AVG. Cache UsageCalculates the average cache reads of a query in a selected time range.
Total CPU TimeSummarizes the total CPU time of all the query’s executions in a selected time range.
Total DurationSummarizes the total duration time of all the query’s executions in a selected time range.
Total Disk UsageSummarizes the total Disk I/O of all the query’s executions in a selected time range.
Total Cache UsageSummarizes the total cache reads from all the query’s executions in a selected time range.
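
The QAnalyze metrics are counts, totals, and averages over all executions of one query in a selected time range. The following sketch shows that aggregation. The `Execution` record and `summarize` function are hypothetical, and the units (seconds and MB) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    """One hypothetical execution of a query within the selected time range."""
    duration_sec: float
    cpu_sec: float
    disk_io_mb: float
    cache_reads_mb: float


def summarize(executions: list[Execution]) -> dict[str, float]:
    """Aggregate executions into QAnalyze-style counts, totals, and averages."""
    count = len(executions)
    if count == 0:
        raise ValueError("no executions in the selected time range")
    total_duration = sum(e.duration_sec for e in executions)
    total_cpu = sum(e.cpu_sec for e in executions)
    total_disk = sum(e.disk_io_mb for e in executions)
    total_cache = sum(e.cache_reads_mb for e in executions)
    return {
        "execution_count": count,
        "total_duration": total_duration,
        "avg_duration": total_duration / count,
        "total_cpu_time": total_cpu,
        "total_disk_usage": total_disk,
        "avg_disk_io": total_disk / count,
        "total_cache_usage": total_cache,
        "avg_cache_usage": total_cache / count,
    }
```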

— Filters in QAnalyze

FilterDescriptionInvestigate this alert
Missing IndexChoose ‘exists’ to display sessions that have queries with an index recommendation.
Plan ImproveChoose ‘exists’ to display sessions that have queries with a plan improvement recommendation.
Max Query CPUChoose “more than” or “less than” a selected maximum CPU utilization in percentage.
Avg. SQL CPUChoose “more than” or “less than” a selected average CPU utilization in percentage.
Execution CountChoose “more than” or “less than” a selected amount of executions.
Avg. DurationChoose “more than” or “less than” a selected average query’s duration in seconds or minutes.
Avg. Cache UsageChoose “more than” or “less than” a selected average cache reads in bytes, KB, MB or GB.

Observer Module

-Windows

— Change Tracking

ChangeDescriptionInvestigate this alert
Computer NameInforms of a change in the computer’s name.
CPU CoresInforms of a change in the number of cores.
CPU SpecificationsInforms of a change in the CPU specifications: manufacturer/speed/model.
Firewall ProfileInforms of a change in the Windows Firewall general profile: Domain, Private, Public.
Last RestartInforms of a restart (reboot) and its date.
ManufacturerInforms of a change in the machine manufacturer.
Total MemoryInforms of a change in the total memory.
Operating SystemInforms of a change in the operating system version.
SPInforms of a change in the operating system’s service pack (SP).
Windows Update DateInforms of a Windows Update and its date.
Software Installation DateInforms of a software installation or update and its date.
Paging MaxInforms of a change in the maximum size set for a Pagefile.
Paging MinInforms of a change in the minimum size set for a Pagefile.
Network BandwidthInforms of a change in the network bandwidth of a card.
Service Account NameInforms of a change in the account name of a service.
Service PathInforms of a change in the path of a service.
Service Start ModeInforms of a change in the start mode of a service: Automatic, Automatic (Delayed Start), Manual, Disabled.
Service StateInforms of a change in the state of a service: Running, Paused, Stopped.
Total DiskInforms of a change in a disk’s total capacity.
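
Conceptually, change tracking compares the properties collected now against the previously stored values and reports every difference. A minimal sketch of that comparison, assuming snapshots are plain dictionaries keyed by the change names listed above (the sample values are hypothetical):

```python
def detect_changes(previous, current):
    """Compare two property snapshots and list (name, old, new) for each change."""
    changes = []
    for key in sorted(set(previous) | set(current)):
        old, new = previous.get(key), current.get(key)
        if old != new:
            changes.append((key, old, new))
    return changes


# Hypothetical snapshots of a host, taken at two points in time.
before = {"Computer Name": "SRV-01", "CPU Cores": 8, "Total Memory": 32}
after = {"Computer Name": "SRV-01", "CPU Cores": 16, "Total Memory": 64}

changes = detect_changes(before, after)
```

The same comparison applies to the MSSQL change tables that follow; only the tracked property names differ.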

-MSSQL on Windows

— Change Tracking

ChangeDescriptionInvestigate this alert
CollationInforms of a change in the SQL Server Collation.
EditionInforms of a change in the SQL Server Edition.
VersionInforms of a change in the SQL Server Version.
Last RestartInforms of a restart of the SQL Server instance and its date.
Cores AvailableInforms of a change in the number of available logical (virtual) cores for SQL Server.
Cores In UseInforms of a change in the number of cores in use by SQL Server.
Cluster Active NameInforms of a change in the name of the active node in a clustered instance.
SP or CUInforms of a change in the SQL Server service pack (SP) or cumulative update (CU). For each version, check the latest recommended SP or CU.
AlwaysOn Backup PreferenceInforms of a change in a database AlwaysOn Backup preference.
AlwaysOn Group NameInforms of a change in a database AlwaysOn group name.
AlwaysOn HealthInforms of a change in a database AlwaysOn health status.
AlwaysOn StateInforms of a change in a database AlwaysOn state: Not Synchronized, Synchronized.
Auto CloseInforms of a change in a database Auto Close status: enabled/disabled.
Auto Create StatisticsInforms of a change in a database Auto Create Statistics status: enabled/disabled.
Auto Update StatisticsInforms of a change in a database Auto Update Statistics status: enabled/disabled.
Auto ShrinkInforms of a change in a database Auto Shrink status: enabled/disabled.
Database Compatibility LevelInforms of a change in a database Compatibility level.
Database Creation DateInforms of a new database or a change in a database creation date.
Database Data DriveInforms of a change in a database data drive path.
Database File Stream DriveInforms of a change in a database file stream drive path.
Database Log DriveInforms of a change in a database log drive path.

-MSSQL on Linux

— Change Tracking

ChangeDescriptionInvestigate this alert
CollationInforms of a change in the SQL Server Collation.
EditionInforms of a change in the SQL Server Edition.
VersionInforms of a change in the SQL Server Version.
Last RestartInforms of a restart of the SQL Server instance and its date.
Cores AvailableInforms of a change in the number of available logical (virtual) cores for SQL Server.
Cores In UseInforms of a change in the number of cores in use by SQL Server.
Cluster Active NameInforms of a change in the name of the active node in a clustered instance.
SP or CUInforms of a change in the SQL Server service pack (SP) or cumulative update (CU). For each version, check the latest recommended SP or CU.
AlwaysOn Backup PreferenceInforms of a change in a database AlwaysOn Backup preference.
AlwaysOn Group NameInforms of a change in a database AlwaysOn group name.
AlwaysOn HealthInforms of a change in a database AlwaysOn health status.
AlwaysOn StateInforms of a change in a database AlwaysOn state: Not Synchronized, Synchronized.
Auto CloseInforms of a change in a database Auto Close status: enabled/disabled.
Auto Create StatisticsInforms of a change in a database Auto Create Statistics status: enabled/disabled.
Auto Update StatisticsInforms of a change in a database Auto Update Statistics status: enabled/disabled.
Auto ShrinkInforms of a change in a database Auto Shrink status: enabled/disabled.
Database Compatibility LevelInforms of a change in a database Compatibility level.
Database Creation DateInforms of a new database or a change in a database creation date.
Database Data DriveInforms of a change in a database data drive path.
Database File Stream DriveInforms of a change in a database file stream drive path.
Database Log DriveInforms of a change in a database log drive path.

-MSSQL on Azure

— Change Tracking

ChangeDescriptionInvestigate this alert
CapacityInforms of a change in the Azure SQL Database or Managed Instance capacity (related to the selected pricing model).
Virtual CoreInforms of a change in the number of virtual cores available in the Azure SQL Database or Managed Instance.
CollationInforms of a change in the SQL Server Collation.
EditionInforms of a change in the SQL Server Edition.
VersionInforms of a change in the SQL Server Version.
Last RestartInforms of a restart of the SQL Server instance and its date.
Cores AvailableInforms of a change in the number of available logical (virtual) cores for SQL Server.
Cores In UseInforms of a change in the number of cores in use by SQL Server.
Cluster Active NameInforms of a change in the name of the active node in a clustered instance.
SP or CUInforms of a change in the SQL Server service pack (SP) or cumulative update (CU). For each version, check the latest recommended SP or CU.
AlwaysOn Backup PreferenceInforms of a change in a database AlwaysOn Backup preference.
AlwaysOn Group NameInforms of a change in a database AlwaysOn group name.
AlwaysOn HealthInforms of a change in a database AlwaysOn health status.
AlwaysOn StateInforms of a change in a database AlwaysOn state: Not Synchronized, Synchronized.
Auto CloseInforms of a change in a database Auto Close status: enabled/disabled.
Auto Create StatisticsInforms of a change in a database Auto Create Statistics status: enabled/disabled.
Auto Update StatisticsInforms of a change in a database Auto Update Statistics status: enabled/disabled.
Auto ShrinkInforms of a change in a database Auto Shrink status: enabled/disabled.
Database Compatibility LevelInforms of a change in a database Compatibility level.
Database Creation DateInforms of a new database or a change in a database creation date.
Database Data DriveInforms of a change in a database data drive path.
Database File Stream DriveInforms of a change in a database file stream drive path.
Database Log DriveInforms of a change in a database log drive path.

-MSSQL on Amazon RDS

— Change Tracking

ChangeDescriptionInvestigate this alert
CollationInforms of a change in the SQL Server Collation.
EditionInforms of a change in the SQL Server Edition.
VersionInforms of a change in the SQL Server Version.
Last RestartInforms of a restart of the SQL Server instance and its date.
Cores AvailableInforms of a change in the number of available logical (virtual) cores for SQL Server.
Cores In UseInforms of a change in the number of cores in use by SQL Server.
Cluster Active NameInforms of a change in the name of the active node in a clustered instance.
SP or CUInforms of a change in the SQL Server service pack (SP) or cumulative update (CU). For each version, check the latest recommended SP or CU.
AlwaysOn Backup PreferenceInforms of a change in a database AlwaysOn Backup preference.
AlwaysOn Group NameInforms of a change in a database AlwaysOn group name.
AlwaysOn HealthInforms of a change in a database AlwaysOn health status.
AlwaysOn StateInforms of a change in a database AlwaysOn state: Not Synchronized, Synchronized.
Auto CloseInforms of a change in a database Auto Close status: enabled/disabled.
Auto Create StatisticsInforms of a change in a database Auto Create Statistics status: enabled/disabled.
Auto Update StatisticsInforms of a change in a database Auto Update Statistics status: enabled/disabled.
Auto ShrinkInforms of a change in a database Auto Shrink status: enabled/disabled.
Database Compatibility LevelInforms of a change in a database Compatibility level.
Database Creation DateInforms of a new database or a change in a database creation date.
Database Data DriveInforms of a change in a database data drive path.
Database File Stream DriveInforms of a change in a database file stream drive path.
Database Log DriveInforms of a change in a database log drive path.

Web Module

MetricDescriptionInvestigate this alert
Client IPThe machine or server from which the request was sent.
URLThe site the client is trying to access. In addition to the website name, parameters and error messages are displayed.
MethodThe method of the sent request; can be GET or POST.
Start timeThe time when the request was sent.
DurationThe time it took to receive a reply after sending a request. It does not include the render time, which depends on other factors such as connectivity quality (not related to the web server itself). A duration above a few seconds indicates a slowness problem.
StatusThe status of the sent request. The statuses available for filtering include: 200 OK, 304 Not Modified, 404 Not Found, 500 Internal Server Error. Both 404 and 500 indicate a serious problem.
HostThe server on which the software is installed.
Web Site NameThe site we are trying to reach. In addition to the website name, parameters and error messages are displayed.
Server IPHost IP.
PortThe port from which the request is sent.
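
The Status and Duration guidance above can be expressed as a simple check on each request. A minimal sketch, assuming “a few seconds” is taken as 3 seconds (an illustrative threshold, not a value defined by the platform):

```python
SERIOUS_STATUSES = {404, 500}  # per the table: both indicate a serious problem
SLOW_THRESHOLD_S = 3.0         # assumption: "a few seconds" interpreted as 3 s


def assess_request(status, duration_s):
    """Return the list of issues found for one web request."""
    issues = []
    if status in SERIOUS_STATUSES:
        issues.append("serious status")
    if duration_s > SLOW_THRESHOLD_S:
        issues.append("slow")
    return issues


healthy = assess_request(200, 0.2)
failing = assess_request(500, 4.0)
cached_but_slow = assess_request(304, 5.0)
```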

Connections Module

— Files

MetricDescriptionInvestigate this alert
NameThe file display name given in the AimBetter Configuration.
PathThe file path. It can be a local or remote path.
TypeFile or Folder.
Duration (ms)The time it takes to reach the file in milliseconds.
SizeThe size of the file or folder in MB.
FoldersThe number of folders (if any).
FilesThe number of files (if any).
NotesDescribes the specific error with file/folder access.
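
To make the file metrics concrete, the sketch below gathers comparable values for a local path. It assumes that “reaching” the path means a single metadata lookup and that the size of a folder is the sum of the files beneath it; both are illustrative interpretations, not the platform's definitions.

```python
import os
import tempfile
import time

BYTES_PER_MB = 1024 * 1024


def probe_path(path):
    """Collect Type, Duration (ms), Size (MB), Folders, Files and Notes for a path."""
    result = {"type": None, "duration_ms": 0.0, "size_mb": 0.0,
              "folders": 0, "files": 0, "notes": ""}
    start = time.perf_counter()
    try:
        os.stat(path)  # assumption: reaching the path = one metadata lookup
        reachable = True
    except OSError as exc:
        reachable = False
        result["notes"] = str(exc)
    result["duration_ms"] = (time.perf_counter() - start) * 1000
    if not reachable:
        return result

    if os.path.isfile(path):
        result["type"] = "File"
        result["size_mb"] = os.path.getsize(path) / BYTES_PER_MB
        return result

    result["type"] = "Folder"
    total_bytes = 0
    for root, dirs, files in os.walk(path):
        result["folders"] += len(dirs)
        result["files"] += len(files)
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
    result["size_mb"] = total_bytes / BYTES_PER_MB
    return result


# Demonstration on a throwaway folder: one subfolder and two 1 KB files.
with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "logs"))
    for name in ("a.txt", os.path.join("logs", "b.txt")):
        with open(os.path.join(base, name), "w") as handle:
            handle.write("x" * 1024)
    folder_report = probe_path(base)
    missing_report = probe_path(os.path.join(base, "no-such-file"))
```

For the missing path, `notes` carries the operating system's error text, which mirrors the role of the Notes column.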

— Web

MetricDescriptionInvestigate this alert
NameThe website display name given in the AimBetter Configuration.
URLThe URL of the website.
StatusThe status of the website. Can be OK / Not Modified / Not Found / Internal Server Error.
Status CodeThe status code of the website: 200, 304, 404, 500.
Round Trip (ms)The time in milliseconds it takes to load the website (not including the render time).
NotesDescribes the specific error with website access.
SSL Expired DaysThe number of days remaining until the SSL certificate expires.
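
The number of days remaining on a certificate can be derived from its expiry timestamp. A minimal sketch using Python's standard `ssl` module, assuming the expiry is available as a `notAfter` string in the format returned by `ssl.SSLSocket.getpeercert()`; the sample date is hypothetical:

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days remaining until a certificate's 'notAfter' time (negative if expired)."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


# Fixed reference time so the example is reproducible.
reference = datetime(2030, 1, 1, tzinfo=timezone.utc)
remaining = days_until_expiry("Jan 31 00:00:00 2030 GMT", now=reference)
```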

— DB Connection

MetricDescriptionInvestigate this alert
NameThe DB instance display name given in the AimBetter Configuration.
Duration (ms)The time it takes to establish a connection to the DB in milliseconds.
SuccessWhether the connection succeeded or not.
NotesDescribes the specific error with DB connection.
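
The three DB Connection values (duration, success, notes) can all be captured by timing a single connection attempt. A minimal sketch, using an in-memory SQLite database purely as a stand-in for a real DB instance; `check_connection` and `failing_connect` are hypothetical names:

```python
import sqlite3
import time


def check_connection(connect):
    """Time one connection attempt and report duration, success and notes."""
    start = time.perf_counter()
    try:
        connection = connect()
        connection.close()
        success, notes = True, ""
    except Exception as exc:  # any failure is reported in the notes
        success, notes = False, str(exc)
    duration_ms = (time.perf_counter() - start) * 1000
    return {"duration_ms": duration_ms, "success": success, "notes": notes}


def failing_connect():
    # Simulated failure, standing in for an unreachable server.
    raise ConnectionError("login timeout expired")


ok = check_connection(lambda: sqlite3.connect(":memory:"))
failed = check_connection(failing_connect)
```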

— Ping

MetricDescriptionInvestigate this alert
NameThe Ping display name given in the AimBetter Configuration.
Server’s IPThe IP of the server the Ping is being sent to.
Ping Lost Packets (0-12)The number of unsuccessful communication integrity checks out of 12 attempts.Read more
Network JitterThe variation in the delay of packet delivery during all 12 communication integrity checks.Read more
Network LatencyThe time taken for a packet to travel from its source to its destination, including the time spent in transit and any processing delays along the way.Read more
NotesDescribes the specific error with the Ping.
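
The three Ping metrics can be derived from the results of the 12 checks. The sketch below assumes latency is reported as the mean round-trip time of the replies received and jitter as the mean absolute difference between consecutive replies; these are common definitions, not necessarily the exact formulas the platform uses.

```python
from statistics import mean


def ping_summary(rtts_ms):
    """rtts_ms: one round-trip time (ms) per check, or None for a lost packet."""
    received = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(received)
    latency = mean(received) if received else None
    # Jitter: average change in delay between consecutive received replies.
    deltas = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = mean(deltas) if deltas else None
    return {"lost": lost, "latency_ms": latency, "jitter_ms": jitter}


# Hypothetical results of 12 checks, two of which were lost.
samples = [10, 12, None, 11, 10, 14, 10, None, 12, 10, 11, 10]
summary = ping_summary(samples)
```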

— Query

MetricDescriptionInvestigate this alert
NameThe Query display name given in the AimBetter Configuration.
Table/ViewThe name of the table or view that the query runs against.
ColumnThe column name in the query.
Value NumThe output of the query as a number of rows, which allows an alert to be set if it crosses a specific value.
ValueWhich value the query belongs to.
NotesDescribes the specific exception with the execution of the query.

Security Risk Alerts

The following alerts pose security risks.

— Windows

RiskDescriptionInvestigate this alert
Windows RestartA malicious party could initiate a Windows Server restart to exploit the brief downtime, potentially bypassing security controls or launching an attack while the system is rebooting and its defenses are not fully active.
Windows UpdateA malicious party could exploit an update process to introduce vulnerabilities, potentially delaying critical patches or injecting malware, thereby gaining unauthorized access or compromising system integrity.
Change in Firewall ProfileMay expose the network to unauthorized access or attacks by altering the security rules, potentially allowing malicious traffic through previously blocked ports or protocols.
Change in a Service Account NameA change in the service account name can disrupt service operations, potentially causing authentication failures and creating opportunities for unauthorized access or privilege escalation if not properly managed.
Software Installation or UpdateA software installation or update could introduce new vulnerabilities, compromise system stability, or inadvertently install malicious software, leaving the system exposed to attacks.
Remote AccessRemote access programs like AnyDesk, TeamViewer and others could allow unauthorized users to gain control of the server, potentially leading to data breaches, system manipulation, or further malicious activities.Read more
Suspicious ProcessSuspicious programs, such as malicious software, can pose a risk to the company, potentially leading to data breaches, system manipulation, or further malicious activities.Read more
File Not FoundIf a sensitive file is deleted or has its path changed, it could indicate unauthorized access or malicious activity, potentially leading to data breaches, loss of critical information, or system compromise.Read more

— MSSQL

RiskDescriptionInvestigate this alert
MSSQL RestartMay temporarily expose the server to attacks or unauthorized access during the reboot process, especially if security mechanisms and connections are not fully restored or properly secured.
Change in MSSQL Node NameCan create security risks by potentially disrupting access controls and authentication processes, which could lead to unauthorized access or compromise data integrity if not correctly managed.
Cluster Node DownReduces the redundancy and failover capability of the system, potentially making the server more vulnerable to attacks, data loss, or service disruption during the downtime.
Change in Database StatusThe database may be temporarily inaccessible or vulnerable to unauthorized access, data corruption, or loss if the change is not properly managed and secured.Read more
Change in Database Mirror StatusDisrupts data synchronization and failover mechanisms, potentially leading to data loss, inconsistent backups, or exposure to unauthorized access if the mirrored database becomes vulnerable.Read more
Change in Database AlwaysOn StatusCompromises high availability and failover protection, potentially leading to data loss, downtime, or increased vulnerability to unauthorized access during the transition.Read more
Failed Login AttemptsCan indicate brute force attacks or unauthorized access attempts, potentially leading to account compromise if not properly monitored and mitigated.Read more
Permission ViolationAttempts by unauthorized users to access sensitive data could indicate malicious intent to perform actions that compromise the integrity, confidentiality, or availability of the database.
Object Not FoundA high number of object not found exceptions with different object names or identifiers can indicate a potential enumeration attack, where an attacker is systematically probing the database to identify existing objects for further exploitation.
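
The enumeration pattern described for Object Not Found can be spotted by counting how many different object names each login fails to find. A minimal sketch, assuming the exceptions are available as (login, object name) pairs and using an illustrative threshold of 5 distinct names; both the input shape and the threshold are assumptions:

```python
from collections import defaultdict

DISTINCT_THRESHOLD = 5  # assumption: illustrative cut-off, tune per environment


def suspected_enumeration(events, threshold=DISTINCT_THRESHOLD):
    """Return logins whose Object Not Found errors span many distinct objects."""
    names_by_login = defaultdict(set)
    for login, object_name in events:
        names_by_login[login].add(object_name.lower())
    return sorted(
        login for login, names in names_by_login.items() if len(names) >= threshold
    )


# Hypothetical events: one login repeats the same typo, another probes many names.
events = [("app_user", "Ordrs")] * 10 + [
    ("scanner", name)
    for name in ("users", "user", "accounts", "admin", "logins", "passwords")
]

suspects = suspected_enumeration(events)
```

Counting distinct names rather than total errors keeps a single repeated application bug from being flagged as probing.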
