Speaker: Emil Kotrc
Source : Slides
There is no performance difference between left and right as Db2 z/OS will change do right as left
1-- Query 1 : Filtering data from resultset2SELECT d.deptno, d.deptname, e.lastname3FROM dept d LEFT JOIN emp e4ON d.deptno = e.workdept5WHERE d.deptno like 'D%'6-- Note : Filtering in WHERE clause happens first and then join78-- Query 2 : Filtering rows to join(does not filter data from result set)9SELECT d.deptno, d.deptname, e.lastname10FROM dept d LEFT JOIN emp e11ON d.deptno = e.workdept12AND d.deptno like 'D%'13-- Note : Here it joins only department starts with 'D%' and other rows are retreived as its LEFT JOIN.
1-- Shows Top 3 salaries in each department2SELECT workdept, deptname, lastname, salary3FROM dept d,4TABLE(SELECT workdept, lastname, salary FROM emp e5 WHERE e.workdept = d.deptno6 ORDER BY salary DESC7 FETCH FIRST 3 ROWS ONLY)8ORDER BY deptno, salary DESC, lastname;
By adding TABLE keyword(also known as lateral correlation) Db2 allows a join of a table or a nested table expression, which can have a correlated reference to the previous(left) table, not just a reference to a table from a higher level. This is called lateral join.
Masking bonus column
1CREATE MASK bonus_mask ON emp2FOR COLUMN bonus3RETURN4 CASE5 WHEN (bonus > 500.00) THEN NULL6 ELSE bonus7 END8ENALBE;910ALTER TABLE emp ACTIVATE COLUMN ACCESS CONTROL;
Mask is not encryption
Speaker: Lena Rösch
Source : Slides
Speaker: Steve Warren
Source : Slides
Mainframe was named z/OS in 2001 and before that it was called OS/390(1996).
Each mainframe box is called as a CPC(central processor complex) that includes main storage, one or more central processors, timers, and channels.
Each CPC can have upto 85 LPARS(Logical Partitions) meaning upto 85 different instances of OS can be running concurrently on one box and those OS can have their own dedicated memory, processors and shares resources and can have dedicated I/O. Each LPAR can have 16TB of memory in z/OS 2.5.
Z/OS is a collection of elements and base elements are always included in the z/OS as they deliver essential OS functions. Other base elements are,
z/OS Software Stack
DASD - Direct Access Storage Device
VTOC & Catalog
TSO OMVS- use
oedit filenameto edit file.
mvslogin. Slide 102
z/OS has storage ranges that contain code/data that is common to all address spaces. i.e., all applications are referencing same piece of storage regardless of which address space you are in like certain functions of OS need to accessible to all address spaces, certain storage need to be accessible by all address spaces.
Whats left is the Private Area that is available for your use. This areas, normally not accessible by other address spaces. But there are programmable interfaces that allow other applications running in other address spaces to access storage in different address spaces. Your programs will be loaded in this private area in your address space.
An address space size is limited to the amount of memory that can be addressed.
Now we needed a way to indicate which addressing mode the program supported.
AMODE : Addressing mode
RMODE : Residency mode - Since AMODE 31 programs could be run/loaded above or below the line a way was needed to request where the program should reside.
|24||24||Program can access data below the line and program is loaded below the line|
|31||31||Program is loaded above the line and can access data that is in BTL or ATL|
|31||24||Program resides BTL and can access data that is BTL or ATL|
|24||31||No possible as Branches within program would fail since the target location is in 31-bit storage but the address being used is a 24-bit address. Basically cant have AMODE < RMODE.|
The amount of memory required by all the running tasks is usually much greater than the amount of real storage available. This brings the concept of Virtual Memory.
Every OS has concepts of Swap disk, swap files.. z/OS is the first OS to have this concept implemented.
When a program is executing, it needs to be in real memory as when processor is executing instructions, it expects the instructions to be in real memory. But when they are not running, they can be in Auxiliary Storage.
Idea of Virtual Storage involves, knowing when bring into real storage and when it can be in auxiliary storage.
Virtual, Real and Auxiliary is divided into 4K Chunks.
Dynamic Address Translation - OS converting virtual addresses into real addresses. When a program is running, the processor must be able to locate its virtual pages in real storage.
Speaker: Jim Dee
Source : Slides
In the beginning, DBAs were using
EDITPROC utility to save disk space which is a exit routine.
In Db2 V3, IBM came with compression on engine itself and they exploited a hardware instruction(CMPSC), its an expensive instruction but its very cheaper when compare to writing compression softwares.
CMPSC depends on a dictionary which depends on strings found in the table, it implements Lempel-Ziv algorithm. Dictionary is created during load or reorg utility. Copy of the dictionary is stored in disk as part of table.
Overhead pages : Load utility write images of each page of the table to the end of bufferpool. They are very important, but has nothing to do with compression.
Uncompressed rows : Load writes uncompressed rows to first data pages of the table as its doing that its building the dictionary in memory.
Dictionary pages : When enough dictionary is built for compression to use. Dictionary pages are added before the data pages in the structure
Compressed rows : Load utility uses the dictionary to compress the rows.
Above diagram could represent a partition or partition of PBR but each partition of a PBG has its own dictionary but it should look similar.
Use DSN1COMP to analyze compression
Db2 uses row level hardware compression. Each row is decompressed before passed from bufferpool. All SQL sees is decompressed data.
Copies and logs are compressed
Application performing random access retreiving less rows are effective than sequential access as they have to decompress less number of rows.
In Db2 10, IBM added Auto Compression. Can now build dictionary without Load or Reorg.
In Db2 12, FL504, IBM added Huffman compression
In FL509, IBM added Huffman compression at object level( can mention in DDL -
COMPRESS HUFFMAN )
Compressed on disk and other times you are seeing the expanded page in bufferpool.
Only leaf pages are compressed
Compression is done by software.
4K on disk but 8/16/32K in bufferpool.
If you are going read one key from a page, entire page must be decompressed.
Db2 uses a special 4k buffer which is page-fixed in Db2 memory and its not a bufferpools.
Here sequential access performs better than random access. Complete opposite to tablespace compression.
In Db2 12, FL 500, IBM added online ALTER INDEX COMPRESS YES. Index doesn't go offline(rebuild pending). Now its in advisory and index will be available.
Expanding z/OS Software Ecosystem
Traditionally z/OS was all about running OLTP applications using middleware products such as CICS, Db2, MQ and they all have high availability, high throughput transactions on the system. Common languages found in z/OS Assembler, COBOL, PL1 and Batch programs which were run in middleware infrastructures.
Around 25yrs back, Unix System Services were introduced(TSO OMVS) and z/OS is a certified Unix Operating System to this day. With this it was easy to port unix applications to z/OS and recompile and run it on z/Os Unix and this led to running C++, Java and WebSphere application in z/OS Unix. Now, we were able to run z/OSMF, Spark and Zowe applications on z/OS Unix.
Next evolution in z/OS Unix is z/OS Container Extensions(zCX). With this, we can take Linux on Z Binaries and deploy them directly in z/OS as docker containers. So now, we can run mongoDB, Kafka, NGINX, Spark, Maven and Redis in z. Requires z14 or later.
Applications which are running in linux and interacting with z/Os anyway those are the workloads which could be run in zCX. What those applications get out of the box is, improved network transaction rates by 81% and reduced latency by 45%.
IBM provides a turnkey solution,
Usecases of zCX
IBM zCX - z/OS Storage Integration
Virtualization Layer is not provisioned as part of zCX address space but its a part of z/OS Base in BCP Base. So you can get upgrades to this part alone. This manages the storage, network and console access.
zCX Applicance contains the Linux Kernel, Docker Engine and zOSMF Workflows. zCX Applicance code is deployed into a separate Address Space, Virtualization Layer code is not part of this zCX address space.
zCX - CPU, Memory, Storage and Workload Management
Speaker: Rob Gould and Gareth Copplestone-Jones
Introduced in Db2 11 - Objective was to avoid the need to modify application programs when migrating a Db2 system to a new release. This was a way to ignore pre-reqs before migrating to new version.
In static SQLs, in the BIND PACKAGE command you can mention.
In Dynamic DBA and Developer Packages, you can mention special register
SET CURRENT APPLICATION COMPATIBILITY='V12RM503'
In Generic Dynamic SQL Packags e.g., NULLID. Which packages does it pick, thats where it get quite messy.
Note : IFCIDs to capture and report application incompatibilities(366,376)
Four levels are associated with continous delivery
Maintenance level - lifted by applying maintenance which contains defect fixes and enchancements, also known as (Db2) code level.
Catalog level - vehicle to enable new Function Level, needed for some Function Levels
Function level - needs to be activated, introduces new Db2 for z/OS features and functionality, with no impact on existing application behaviour.
APPLCOMPAT - set at the package level, providing an "island of stability" for a given application
When to BIND/REBIND with a higher APPLCOMPAT ?
Multiple collections for NULLID packages are already used by some sites for managing KEEPDYNAMIC attribute and for High Performance DBATS
Speaker: Huong Tran
Source : Slides
Speaker: Mark Rader
Source : Slides
ssnmIRLMallocates lock structure: dsngrpnm_LOCK1 (LOCK1)
ssnmMSTRallocates shared communication area: dsngrpnm_SCA(SCA)
ssnmDBM1allocates group buffer pool(GBP) for each local BP with shared data
Below structures are required by Db2,
This was designed in mid-90s.
GBPs is spread across two coupling facility. Idea is some large ones in one coupling facility and smallers ones are in another. Its done to balance memory and CPU
Failure to any single CEC(of total 4) tolerated
In this IBM introduced Integrated CF(ICF). Here LPAR having one of the CF is put on a system where there is already a z/OS and Db2.
Around the same time Duplexed GBPs were introduced as well. Here Db2 is writing to XES API to manage how the GBPs allocated and where primary is based on overall load.
Here as long as we have SCA and LOCK1 on external CF system, we can tolerate failure on any single system.
In System Managed Duplexing, CF Structure rebuild is a process by which sysplex allocates new instance of CF structure, populates the structure with data, and proceeds to use teh new structure instance.
Simplex rebuild - Discards the old instance, uses the new.
Duplex rebuild - Uses both instances, sychronnizing updates so the structure contents remain the same.
There are two ways to rebuild,
User managed - Tailored to application, involves significant programming effort, atleast one of the Db2 must be connected to structure to rebuild it.
System managed - General purpose, no development cost as system does the work, it can rebuild the structure from one CF to another CF even if there are no connectors.
System managed is the best
Percentage of requests that get duplexed
Duplexed request increases cost of z/OS CPU(3x to 4x, CF CPU(4x to 5x) and CF Link(6x to 8x).
Here Exploiter is IRLM,
The coupling facilities communicate to coordinate processing of the request so that both structure instances make the same update. Its Duplexed.
After both agree(3,4), the response comes to XES. Now XES knows that request gone in boths the CFs and in sync in both the places.
XES responds back to IRLM(Exploiter) and its passes to application.
Note: Synchronous cannot continue processing until both instances of the lock structure are updated.
This is from Db2 12,
IRLM makes the request.
XES takes that request and sends it to the primary structure
Primary Structure(CF1) add a sequence number(OSN and LCOSNP) on the request and responds back.
XES keeps track of primary and secondary OSNs. Its responds back to IRLM with Primary sequence number.
IRLM responds back to application
Every often XES queries secondary to know the status by checking Secondary OSN.
Note: Db2 will not write teh log output buffer until the corresponding lock information is updated in the secondary structure.
In a parallel sysplex data sharing environment there would be multiple z/OS systems even though each one finding out from Primary whats going on. The values of operation sequence number(LOSN and LCOSNP) will be different across z/OS systems. It keeps track by a Secondary Update Recovery Table(SURT). Failover for async duplex might be a little slower than sync duplex(both CF are identical).
Note: program can continue processing while the secondary instance of the lock structure is updated.
Speaker: Dan Dirkse
Source : Slides
Speaker: Keith Sisson
System Z - Z stands for Zero down time. It can be refered to as
MIPS - Millions of Instructions Per Second, its the measure of the capcity of the machine. Higher the MIPS, more it will cost to operate the machine.
MSU - Millions of Service Units, its the measure of the theoretical capacity of the machine to operate.
Processors are configured to operate differently.
There are two types of processors that are part of mainframe,
Newest Chip from IBM is Telum Chip. Its a 7nm chip, it refers to the size of the transistor on the chip. There's 22.5 Billion transistors on the chip, 8 cores, 5Ghz+ Clocks. Thickness of human hair is 80,000nm-100,000nm
All the tapes and DASD are connected via FICON cables
PR/SM is a hardware hypervisor inside of the mainframe. This is a firmware code that runs on mainframe allows logically partition the device. It can be thought of VMWare.
Address spaces are inside an z/OS LPAR. There are different types of address spaces like,
All address spaces are started identical, created by OS and programs are loaded into them and executed.
An address space can be think of as a giant excel spreadsheet and its 16Exabytes thats 1Million terabyte.
When its first created, it looks like an empty spreadsheet then OS loads few things for you below 16MB LINE as thats what original MVS can talk to as it had 24-bit address.
CICS is a subsystem runs in a address space but refered to as a CICS Region. Its sort of a web server.
Here two lpars are connected via Couple Facility and bunch of CICS Regions running on different lpars for redundancy. CICS Regions on same lpar are grouped and called as CICS System Group and entirely called as CICS Plex.
Db2 Datasharing similar to CICS
Speaker: Michal Bialecki
With any DML(IUD) processing disorganizes data in tables,
REORG TABLESPACE reorders tablespace/partition according to clustering, reclaims fragmented space, materializers pending definition changes
REORG INDEX reorganizes an index space to reclaim fragmented space
RUNSTATS gathers metadata info and info about data characteristics/frequency/cardinality/distribution in tablespace, indexes and partitions. Statistics info quality may degrade as application performs IUD which affects AP choices for
BIND/REBIND prepares and preserves access path runtime and stores in Db2 catalog/directory, from where it can be loaded into Db2 and (re)executed.
With STATSINT zPARM interval in-memory objects statistics are externalized to catalog tables
With every BIND/REBIND/PREPARE there is a SYSIBM.SYSSTATFEEDBACK catalog table updated to indicate missing/conflicting statistics.
SYSSTATFEEDBACK table has a column REASON with following values and their respective description and statements for RUNSTATS in Db2 12 are generated in SYSIBM.SYSTABLES_PROFILES.PROFILE_TEXT
|BASIC||Basic statistic is missing|
|KEYCARD||KEYCARD statistic is missing|
|LOWCARD||Column has low cardinality which indicates data skew is likely|
|NULLABLE||Distribution statistics not available for a nullable column|
|IXPROBE||Index probing was used to improve very low filter factor|
|DEFAULT||Predicate references a value that is probably a default value|
|RANGEPRD||Histogram statistics not available for a range predicate|
|PARALLEL||Parallism could be improved by uniform partitioning of key ranges|
|CONFLICT||Another statistic conflicts with this statistic|
|COMPFFIX||Multi-column cardinality statistics are needed for an index composed filter factor|
Does this solve all the AP problems, NO. Think of this as a suggestion.
Use DSNACCOX formulas to get candidate objects for REORG/RUNSTATS
When data is deleted, in the index its pseudo deleted but data still exists there, so for any other queries, that data will be read and based on the flag it will know its invalid and it will skip it. Point is, there is an IO for this.
Index page splits, when leaf page is full, Db2 will create a new page and move some(50%) of the entries in the old page to new page, this is to balance index tree. This technique doesn't work well with sequential inserts as half of the space in old index page is not used.
INDEX_CLEANUP_THREADS=10means there will be 10 system tasks to do clean up automatically, it cleans up pages that are touched by GETPAGE processing and will be limited to quite periods. This reduces the need for REORG INDEX. IFCID 377 tracks cleanup at the index page level.
RUNSTATS is not recommended outside of REORG, use it only with conjuction with REORG table
RUN ONLY - REORG IX without STATISTICS. Otherwise it may create time drift between AP stats for table and index
Don't use FREQVAL by default
RUNSTATS INDEX ... FREQVAL NUMCOLS n - n>1 are often not useful
Don't use HISTOGRAMS by default, they can quickly become stale for ascending columns(especially dates/timestamps)
From V12, INVALIDATECACHE NO is the default -- Running RUNSTATS will not purge Cached Dynamic Statements(DSC). If you want to invalidate DSC use
Runstats may not be necessary for VOLATILE tables as when Db2 executes queries against volatile tables, Db2 chooses index access whenever possible.
Db2 provides many features to reduce the introduction of new access paths and providing stability and reliability in performance
New REBIND Behaviour from Db2 12 FL505