Skip to content
bobby_dreamer

Notes from GSE UK Conference 2021

db2-notes, mainframe24 min read

  1. 2AD - The joy of joins
  2. 6AD - Mainframe Hardware 101
  3. 2AE - Intro to zOS
  4. 2AH - Squeeze Your Data with Db2A History of compression
  5. 5AK - zOS Container Extensions Part 1 : An Introduction to Running Containers Inside zOS
  6. 7AM - APPLCOMPAT, NULLID and Me
  7. 2AO - How does Db2 z transform your SQL query for better performance
  8. 2AP - Db2 12 for zos and Asynchronous Systems Managed CF Structure Duplexing Update
  9. 2AR - ISPF Master Class - Edit from A to ZLAST
  10. 6AT - zOS Terminology
  11. 4AU - Db2 zOS housekeeping best practices

2AD - The joy of joins

Speaker: Emil Kotrc
Source : Slides

Left outer join

  • Driver table is on the left side
  • NULL supplying table is on the right side

Right outer join

  • Driver table is on the right side
  • NULL supplying table is on the left side

There is no performance difference between left and right as Db2 z/OS will change do right as left

Full outer join

  • Driver and NULL supplying tables on both sides

Outer joins and filtering conditions

1-- Query 1 : Filtering data from resultset
2SELECT d.deptno, d.deptname, e.lastname
3FROM dept d LEFT JOIN emp e
4ON d.deptno = e.workdept
5WHERE d.deptno like 'D%'
6-- Note : Filtering in WHERE clause happens first and then join
7
8-- Query 2 : Filtering rows to join(does not filter data from result set)
9SELECT d.deptno, d.deptname, e.lastname
10FROM dept d LEFT JOIN emp e
11ON d.deptno = e.workdept
12AND d.deptno like 'D%'
13-- Note : Here it joins only department starts with 'D%' and other rows are retreived as its LEFT JOIN.

Lateral Join

1-- Shows Top 3 salaries in each department
2SELECT workdept, deptname, lastname, salary
3FROM dept d,
4TABLE(SELECT workdept, lastname, salary FROM emp e
5 WHERE e.workdept = d.deptno
6 ORDER BY salary DESC
7 FETCH FIRST 3 ROWS ONLY)
8ORDER BY deptno, salary DESC, lastname;

By adding TABLE keyword(also known as lateral correlation) Db2 allows a join of a table or a nested table expression, which can have a correlated reference to the previous(left) table, not just a reference to a table from a higher level. This is called lateral join.

Masking Data

Masking bonus column

1CREATE MASK bonus_mask ON emp
2FOR COLUMN bonus
3RETURN
4 CASE
5 WHEN (bonus > 500.00) THEN NULL
6 ELSE bonus
7 END
8ENALBE;
9
10ALTER TABLE emp ACTIVATE COLUMN ACCESS CONTROL;

Joins Unmasked

Mask is not encryption

Overview of Join Methods

Nested loop join
  • The inner table is scanned repeatedly
    • As many times as qualifying rows from the outer table
  • Used when
    • The outer table is small or the number of qualifying rows is small
    • Highly clustered index exists on the join columns of the inner table
      • Or Db2 can create a sparse index on the inner table
    • No join columns exist(Cartesian join, cross join)
  • Db2 can sort the outer table under certain conditions
    • Join columns are not in the same sequence
    • Join columns of the outer table has no index
    • The index is poorly clustered
  • Sparse Index (Db2 z/OS)
    • A sparse index is an index pointing to a range of n contiguous rows rather than a specific RID
    • Hasing support since Db2 10
    • Access to inner table is more efficient when there is no efficient index
    • A sort of the outer table is avoided
Merge scan join
  • Merge join or sort merge join
  • Requires one or more equi predicates
  • Db2 scans both tables in order of the join columns
    • Db2 may sort the tables
  • The inner table is put into the work file
    • The outer only if sorted
  • When a row from outer matches a row from inner, a combined row is returned.
  • Continue with another row from inner
  • When a match is no longer found, Db2 reads another row of the outer
  • If a unmatched row is found
    • It is discarded for an inner join
    • It is discarded if it is a left join and it comes from inner table
    • It is kept if it is a left join and it comes from an outer table
    • It is kept for full outer join
  • Used when
    • One or more equi predicates
    • Many to many join
    • Large tables with no indexes with matching columns
    • Few columns are selected on inner tables
      • This is the case when a Db2 sort is used
      • The fewer the columns to be sorted, the more efficient the sort is
Hybrid join
  • Applies only to inner join
  • Requires an index on the join column on the inner table
  • Db2 scans the outer table
  • Joins the outer table with record IDs(RIDs) from index on the inner(phase 1)
  • Db2 sorts the data in outer table and the RIDs(phase 2)
  • Db2 retrieves data from the inner table using list prefetch
  • Db2 combines the data from inner and from phase 2
  • Used when
    • A non-clustered index or indexes are used on the join columns of the inner table
    • The outer table has duplicate qualifying rows
Hash join
  • Requires one or more equi predicates
  • Build phase - Db2 reads the inner table and builds an in memory hash table
    • For qualified rows - each key is hashed and stored in the hash table with the row
    • Db2 maintains a chain of collisions
  • Probe phase - Db2 reads the outer table
    • The key is hashed and searched in the hash table
  • Db2 for z/OS equivalent
    • Nested loop join with sparse index and hashing
Star join
  • Join tables from a star schema
    • A fact table with a number of dimension tables
    • Star schema involves three or more tables
  • Db2 joins dimension tables to the fact table according to the multi-coumn index that is defined on the fact table
  • requires a single index on the fact table that supports the filtering that is provided by the dimension tables of the start schema

6AD - Mainframe Hardware 101

Speaker: Lena Rösch
Source : Slides


2AE - Intro to zOS

Speaker: Steve Warren
Source : Slides

  • Mainframe was named z/OS in 2001 and before that it was called OS/390(1996).

  • Each mainframe box is called as a CPC(central processor complex) that includes main storage, one or more central processors, timers, and channels.

  • Each CPC can have upto 85 LPARS(Logical Partitions) meaning upto 85 different instances of OS can be running concurrently on one box and those OS can have their own dedicated memory, processors and shares resources and can have dedicated I/O. Each LPAR can have 16TB of memory in z/OS 2.5.

IBM Z Server

  • Z/OS is a collection of elements and base elements are always included in the z/OS as they deliver essential OS functions. Other base elements are,

    • Communication Server, z/OSMF, TSO, ISPF, JES2
    • There are optional features which are installed in addition to base elements like SDSF, RMF, JES3, DFSMShsm, DFSMSdss
  • z/OS Software Stack

    • Operating System : Provides critical set of system services made possible by the z/Architecture and firmware. Contains base elements and optional features.
    • Middleware : Software that is neither operating system code nor end-user application. Some middlewares are Db2, CICS, IMS, WAS, MQ, SAP(Systems, Applications and Products in Data Processing), TSO/E
    • Applications : Provide end-user interface to product solutions. Like ISPF, CICS Applications.

Software Stack

Storage and Files

  • DASD - Direct Access Storage Device

    • Usually storages are kept in a separate box and they called as DASD Devices
    • DASD are Hard Drives and they are made up of volumes and volumes need serial number(VOLSER) which is made up to 1 to 6 characters
    • Multiple DASD volumes are attached to mainframe operating system
  • Dataset

    • Files are stored in preallocated areas on the hard disk. In mainframe you have to create space in advance.
    • Dataset name is made up of 1 to 22 segments(21 dots) and each segment separated by a period(.) and limited to 8 characters. First segment is called the High-level qualifier and last segment is called the Low-level qualifier.
  • VTOC & Catalog

    • Volume Table of Contents(VTOC), each DASD volume has a VTOC that locates a dataset on the volume.
    • User catalog contains name and location(VOLSER) information
    • There can be thousands of user catalog, to know which catalog contains this information, we have Master catalog. It will basically say, any dataset that starts with this HLQ can be found in this User catalog.

VTOC

  • TSO/E

    • Time Sharing Option/Extension allows users to log on and interactively share resources
  • ISPF

    • Interactive System Productivity Facility
    • Its a hierarchical view of panels(Mainframe UI).
    • Using ISPF one can create datasets, write program, submit jobs and check logs at various screens.
  • z/OS Unix

    • Unix files are not stored in datsets, they use zSeries File System. Similar to a normal unix system but it runs on z/OS.
    • Filenames can be upto 1023 characters and its case-sensitive
    • There are multiple ways to access files in z/OS environments.
      • iSHELL - TSO ISHELL
      • z/OS Unix Directory List Utility - 3.17 in ISPF
      • OMVS - TSO OMVS - use oedit filename to edit file.
      • Remote unix interfaces - Telnet, SSH, NFS. With NFS, you can mount unix directories on windows drive using mvslogin. Slide 102

Address Spaces

  • Address spaces are range of addresses in virtual storage
  • Each user gets an address space containing the same range of addresses as any other user. This give you addressing range where you can load applications and application can load data to process. Each user gets 0-16EB(Exabytes) of addressability.
  • z/Os creates address spaces for some of its interal components(~35 address spaces). Eg: Master, PCAuth, RASP, Trace, DumpSrv, XCFAS, GRS....
  • In addition there are lots of other address spaces for middleware and applications created like SDSF, VTAM, TCAS, TCPIP...
What makes up an Address Space

zOS Address Space

  • z/OS has storage ranges that contain code/data that is common to all address spaces. i.e., all applications are referencing same piece of storage regardless of which address space you are in like certain functions of OS need to accessible to all address spaces, certain storage need to be accessible by all address spaces.

  • Whats left is the Private Area that is available for your use. This areas, normally not accessible by other address spaces. But there are programmable interfaces that allow other applications running in other address spaces to access storage in different address spaces. Your programs will be loaded in this private area in your address space.

How much memory can your program can address

An address space size is limited to the amount of memory that can be addressed.

  • In 1977, an address space was limited to 16MB
  • In 1981, an address space size was increated to 2GB. This is when concept of The Line was introduced.
    • Programs not changed to support 31-bit addressing had to run and could only access data that was "Below the line"
    • Programs that were changed to support 31-bit addressing could run and could also access data that was "Above the line".
  • In 2000, an address space was increated to 64-bit or 16EB addressing. During then introduced "The Bar".

Program Addressability

Now we needed a way to indicate which addressing mode the program supported.

  • AMODE : Addressing mode

    • 24 : Applications can only address data that is below the line
    • 31 : Applications can address data below or above the line
  • RMODE : Residency mode - Since AMODE 31 programs could be run/loaded above or below the line a way was needed to request where the program should reside.

AMODERMODEDescription
2424Program can access data below the line and program is loaded below the line
3131Program is loaded above the line and can access data that is in BTL or ATL
3124Program resides BTL and can access data that is BTL or ATL
2431No possible as Branches within program would fail since the target location is in 31-bit storage but the address being used is a 24-bit address. Basically cant have AMODE < RMODE.
  • AMODE 64 was introduced when 64Bit was introduced and data can be accessed above the bar.

Virtual Storage

  • The amount of memory required by all the running tasks is usually much greater than the amount of real storage available. This brings the concept of Virtual Memory.

  • Every OS has concepts of Swap disk, swap files.. z/OS is the first OS to have this concept implemented.

  • When a program is executing, it needs to be in real memory as when processor is executing instructions, it expects the instructions to be in real memory. But when they are not running, they can be in Auxiliary Storage.

  • Idea of Virtual Storage involves, knowing when bring into real storage and when it can be in auxiliary storage.

  • Virtual, Real and Auxiliary is divided into 4K Chunks.

    • 4KB pieces of Virtual Storage is called pages.
    • 4KB pieces of Real Storage is called frames.
    • 4KB pieces of Auxiliary storage is called slots.
  • Dynamic Address Translation - OS converting virtual addresses into real addresses. When a program is running, the processor must be able to locate its virtual pages in real storage.

Batch Processing

  • Batch processing involves jobs running without user interaction.
  • In JCL, you say where is the program load located, wheres the input to the application program and where is the output is going to go.
  • When a JCL is submitted, it is received in the JES(Job Entry System), queues job waiting to be executed, manages priority, schedules processing and controls output processing.
  • An initiator system program, processes the JCL, creates an environment in an initiator address space and run the job in that address space.

IPL

  • Initial Program Load a.k.a booting the z/OS system.
  • Special console is used which is connected to network of mainframe boxes or Harware Management Console(HMC) via Browser where we load Operating System of a particular partition thats going to start the boot of the operating system(z/OS). Load profile knows the device address and its going to actually kickoff the boot of the IPL.
  • Operating System is on the special volume called the System Resident Volume(SYSRES or SYSRS1)
  • IPL is processed in phases and each phase builds on previous. First phase is just where the OS is located, device number of the SYSRES.
    • Hardware IPL
    • IPL RIMs (Resource Initialization Modules)
    • NIP RIMs
    • MSI : Master Scheduler Initialization. In this phase server important address spaces are created like SMF, System Logger, BCPii(Base Control Program internal interface)
    • JES
    • Then RACF, ICSF, VTAM, TCAS, TCP/IP(with this starting, IPL is considered complete)
    • After this lot of middleware(Db2, CICS, WAS, IMS..) and application needs to be started

2AH - Squeeze Your Data with Db2A History of compression

Speaker: Jim Dee
Source : Slides

Tablespace Compression

In the beginning, DBAs were using EDITPROC utility to save disk space which is a exit routine.

  • Idea is to compress each row before you put them in the bufferpool and expand when it comes out of the bufferpool.

Db2 Editproc

In Db2 V3, IBM came with compression on engine itself and they exploited a hardware instruction(CMPSC), its an expensive instruction but its very cheaper when compare to writing compression softwares.

CMPSC depends on a dictionary which depends on strings found in the table, it implements Lempel-Ziv algorithm. Dictionary is created during load or reorg utility. Copy of the dictionary is stored in disk as part of table.

Load utilty compressing pages

Steps are,

  1. Overhead pages : Load utility write images of each page of the table to the end of bufferpool. They are very important, but has nothing to do with compression.

  2. Uncompressed rows : Load writes uncompressed rows to first data pages of the table as its doing that its building the dictionary in memory.

  3. Dictionary pages : When enough dictionary is built for compression to use. Dictionary pages are added before the data pages in the structure

  4. Compressed rows : Load utility uses the dictionary to compress the rows.

Db2 Tablespace Compression

Above diagram could represent a partition or partition of PBR but each partition of a PBG has its own dictionary but it should look similar.

  • Use DSN1COMP to analyze compression

    • Recommended to run against a imagecopy
  • Db2 uses row level hardware compression. Each row is decompressed before passed from bufferpool. All SQL sees is decompressed data.

  • Copies and logs are compressed

  • Application performing random access retreiving less rows are effective than sequential access as they have to decompress less number of rows.

  • In Db2 10, IBM added Auto Compression. Can now build dictionary without Load or Reorg.

  • In Db2 12, FL504, IBM added Huffman compression

  • In FL509, IBM added Huffman compression at object level( can mention in DDL - COMPRESS HUFFMAN )

    • With HUFFMAN, CPU usage might be higher but disk savings should be greater.

Index Compression

  • Compressed on disk and other times you are seeing the expanded page in bufferpool.

  • Only leaf pages are compressed

  • Compression is done by software.

  • 4K on disk but 8/16/32K in bufferpool.

Index Compression

  • If you are going read one key from a page, entire page must be decompressed.

  • Db2 uses a special 4k buffer which is page-fixed in Db2 memory and its not a bufferpools.

    • An Index Page from BP is compressed into a 4K image(special buffer) and written to disk(usually multiple pages are written)
    • When reading, 4k page is read from disk into special buffer and then expanded into bufferpool.
  • Here sequential access performs better than random access. Complete opposite to tablespace compression.

  • In Db2 12, FL 500, IBM added online ALTER INDEX COMPRESS YES. Index doesn't go offline(rebuild pending). Now its in advisory and index will be available.

LOB Compression

  • Available from Db2 12 M500
  • Compression/Expansion at the time of I/O. Only compressed in Disk.
  • LOB Compression exploits cache processor z/EDC
  • There is a no separate dictionary pages, the dictionary is stored in compression string in the LOB itself.

5AK - zOS Container Extensions Part 1 : An Introduction to Running Containers Inside zOS

Speaker: Steve Warren
Source : Part 1 Slide, Part 2 Slide

Expanding z/OS Software Ecosystem

  • Traditionally z/OS was all about running OLTP applications using middleware products such as CICS, Db2, MQ and they all have high availability, high throughput transactions on the system. Common languages found in z/OS Assembler, COBOL, PL1 and Batch programs which were run in middleware infrastructures.

  • Around 25yrs back, Unix System Services were introduced(TSO OMVS) and z/OS is a certified Unix Operating System to this day. With this it was easy to port unix applications to z/OS and recompile and run it on z/Os Unix and this led to running C++, Java and WebSphere application in z/OS Unix. Now, we were able to run z/OSMF, Spark and Zowe applications on z/OS Unix.

  • Next evolution in z/OS Unix is z/OS Container Extensions(zCX). With this, we can take Linux on Z Binaries and deploy them directly in z/OS as docker containers. So now, we can run mongoDB, Kafka, NGINX, Spark, Maven and Redis in z. Requires z14 or later.

  • Applications which are running in linux and interacting with z/Os anyway those are the workloads which could be run in zCX. What those applications get out of the box is, improved network transaction rates by 81% and reduced latency by 45%.

  • IBM provides a turnkey solution,

    • IBM has pre-packaged number of items inside the zCX server that enable you to get up and running right away. No need to procure linux kernal or docker engine.
    • Provisioned using z/OSMF workflows
    • IBM provides regular service to these appliances.
    • zCX workloads are zIIP eligible

zCX

  • Usecases of zCX

    • Microservice - logstash, Etcd, Wordpress, etc.
    • No-SQL - MongoDB, IBM Cloudant, etc.
    • Analytics frameworks - z/OS Spark, TensorFlow
    • Messaging Frameworks - IBM MQ, Kafka
    • App Connect Enterprise
    • Web server proxies
    • Emerging programing languages
  • IBM zCX - z/OS Storage Integration

    • IBM did not modify Linux Kernel to leverage z/OS technologies
    • What IBM did is ? zCX developers used a published interface that interacts with Storage and Networking called VirtIO devices. They are like device drivers.
      • Suppose from zCX you write to data, its going go to VirtIO Storage to zCX Virtualization Layer then to DFSMS VSAM to update.

Storage Integration

  • Virtualization Layer is not provisioned as part of zCX address space but its a part of z/OS Base in BCP Base. So you can get upgrades to this part alone. This manages the storage, network and console access.

  • zCX Applicance contains the Linux Kernel, Docker Engine and zOSMF Workflows. zCX Applicance code is deployed into a separate Address Space, Virtualization Layer code is not part of this zCX address space.

  • zCX - CPU, Memory, Storage and Workload Management

    • Each zCX server is provisioned as a Address space and each address space has specific assigned storage, network and memory resources.
    • Those server needs their own memory(fixed) above 2GB bar. When you page-fix memory that basically takes that memory off the table for other applications to use. That memory is not shared with other applications.
    • Shares CPU resources with other address spaces but can influence resource access via configuration and WLM policies.
    • zCX requires you to specify how many virtual CPUs it can use. Each virtual CPU is a dispatchable thread(ie., These CPUs translate into MVS TCB.). zCX utilizes zIIP CPU via MVS dispatcher. The dispatcher will dispatch these tasks on the available zIIPs or general purpose CPUs based on WLM priorities.
    • All memory is backed by pages(4k, 1M and 2G) in virtual storage. Bigger the page, the less z/OS overhead(2G fixed pages are for best performance).

zCX Memory/CPU


7AM - APPLCOMPAT, NULLID and Me

Speaker: Rob Gould and Gareth Copplestone-Jones

Managing APPLCOMPAT

Introduced in Db2 11 - Objective was to avoid the need to modify application programs when migrating a Db2 system to a new release. This was a way to ignore pre-reqs before migrating to new version.

  • In static SQLs, in the BIND PACKAGE command you can mention.

    • APPLCOMPAT(V12R1M505)
    • APPLCOMPAT(V12R1M100)
    • APPLCOMPAT(V12R1M503)
    • If not specified, it will use APPLCOMPAT zParm value.
  • In Dynamic DBA and Developer Packages, you can mention special register SET CURRENT APPLICATION COMPATIBILITY='V12RM503'

    • In Db2 12, CURRENT APPLICATION COMPATIBILITY must be <= the BIND APPLCOMPATIBILITY setting.
  • In Generic Dynamic SQL Packags e.g., NULLID. Which packages does it pick, thats where it get quite messy.

Note : IFCIDs to capture and report application incompatibilities(366,376)

Continous Delivery

Four levels are associated with continous delivery

  1. Maintenance level - lifted by applying maintenance which contains defect fixes and enchancements, also known as (Db2) code level.

  2. Catalog level - vehicle to enable new Function Level, needed for some Function Levels

  3. Function level - needs to be activated, introduces new Db2 for z/OS features and functionality, with no impact on existing application behaviour.

  4. APPLCOMPAT - set at the package level, providing an "island of stability" for a given application

  • Determines SQL function level of application
  • APPLCOMPAT attribute must be <= Function Level
  • APPLCOMPAT must be advanced via BIND/REBIND to exploit new SQL function

Exploiting New SQL Function

Exploiting New SQL Functions

When to BIND/REBIND with a higher APPLCOMPAT ?

  • Once you eliminate all the application incompatibilities for a given package.
  • You plan to exploit new functionality
  • When IBM has withdrawn support of the APPLCOMPAT level.

APPLCOMPAT and NULLID packages

  • NULLID : They are shared, generic packages used by client applications(JDBC, ODBC, CLI, .Net) accessing Db2 via DRDA over TCP/IP. Package names are,
    • SYSSHxyy : S indicates small package like small cursor size and H indicates CURSOR WITH HOLD. x indicates isolation levels like 1(UR), 2(CS), 3(RS) and 4(RR). yy is just sequential numbers between 00 - FF.
    • SYSSNxyy : Here N indicates you are using NOT WITH HOLD
    • SYSLHxyy : Here L indicates Large package
    • SYSLNxyy :

Multiple collections for NULLID packages are already used by some sites for managing KEEPDYNAMIC attribute and for High Performance DBATS

  • Use meaningful names for collections like NULLID_V12R1M504
  • Usage: BIND PACKAGE(NULLID_V12R1M504) COPY(NULLID.SYSLN201) ACTION(ADD) APPLCOMPAT(V12R1M504)
  • Specify collection/package set via CURRENT PACKAGESET or CURRENT PACKAGE PATH special registers. Do remmeber CURRENT PACKAGE PATH takes precedence if both specified.
  • Client-side configuration

    • Specify CURRENT PACKAGE PATH / CURRENT PACKAGE SET via:
      • currentPackageSet/currentPackagePath JDBC properties
      • CurrentPackageSet/CurrentPackagePath CLI/ODBC properties
    • Can also be specified in connection string
    • Cons being may application to configure
  • Server-side configuration

    • Functionality of CURRENT PACKAGE PATH at server side is achived by using System Profile Monitoring. We have two profile tables(not a catalog table).
      • SYSIBM.DSN_PROFILE_TABLE : defines filtering scopes, who or what is being monitored.
      • SYSIBM.DSN_PROFILE_ATTRIBUTES : defines the actions applied to whats being monitored.
      • This be started and stopped by -START PROFILE and -STOP PROFILE commands.

2AO - How does Db2 z transform your SQL query for better performance

Speaker: Huong Tran
Source : Slides


2AP - Db2 12 for zos and Asynchronous Systems Managed CF Structure Duplexing Update

Speaker: Mark Rader
Source : Slides

Parallel Sysplex basics for Db2 z/OS

  • In Data Sharing systems, Db2 subsystems are called as Members
  • Coupling Facility(CF) structures are used for high-speed sharing of locks, status information and data(tables and indexes)
  • ssnmIRLM allocates lock structure: dsngrpnm_LOCK1 (LOCK1)
  • ssnmMSTR allocates shared communication area: dsngrpnm_SCA(SCA)
  • ssnmDBM1 allocates group buffer pool(GBP) for each local BP with shared data
    • dsngrpnm_GBPn(GBP0, GBP1,... GBP8K0,... GBP16K0,... GBP32K,...)

Below structures are required by Db2,

  • LOCK1 and SCA, if Db2 cannot allocate them, Db2 will fail.
    • All IRLMs required to rebuild LOCK1
    • All MSTRs required to rebuild SCA

Original Parallel Sysplex Configuration

This was designed in mid-90s.

Original Parallel Sysplex

  • GBPs is spread across two coupling facility. Idea is some large ones in one coupling facility and smallers ones are in another. Its done to balance memory and CPU

  • Failure to any single CEC(of total 4) tolerated

    • If we loose CEC having CF that has SCA & LOCK1, 2 Db2s(IRLM and MSTR) will be used to rebuild structures in other CF by connectors
    • If we loose OS/390 or Db2 member, then we have to restart Db2 member to release retained locks.

Second Parallel Sysplex Configuration

In this IBM introduced Integrated CF(ICF). Here LPAR having one of the CF is put on a system where there is already a z/OS and Db2.

Around the same time Duplexed GBPs were introduced as well. Here Db2 is writing to XES API to manage how the GBPs allocated and where primary is based on overall load.

Second Version of Parallel Sysplex

Here as long as we have SCA and LOCK1 on external CF system, we can tolerate failure on any single system.

Third Parallel Sysplex Configuration

Third Version of Parallel Sysplex

In System Managed Duplexing, CF Structure rebuild is a process by which sysplex allocates new instance of CF structure, populates the structure with data, and proceeds to use teh new structure instance.

CF Rebuild

  • Simplex rebuild - Discards the old instance, uses the new.

  • Duplex rebuild - Uses both instances, sychronnizing updates so the structure contents remain the same.

There are two ways to rebuild,

  • User managed - Tailored to application, involves significant programming effort, atleast one of the Db2 must be connected to structure to rebuild it.

  • System managed - General purpose, no development cost as system does the work, it can rebuild the structure from one CF to another CF even if there are no connectors.

System managed is the best

Percentage of requests that get duplexed

  • Cache - 20% is typical [not Db2 GBPs]
  • List - 100% - Db2 SCA
  • Lock - 100% - Db2 Lock1

Duplexed request increases cost of z/OS CPU(3x to 4x, CF CPU(4x to 5x) and CF Link(6x to 8x).

Synchronous System Managed CF structure duplexing

Here Exploiter is IRLM,

  1. IRLM makes a request
  2. XES takes and splits the request and send to two CF structures.
  3. Both CFs exchange Ready To Execute(RTE) signals.
  4. Then both CFs exchange Ready to Complete(RTC) signals

The coupling facilities communicate to coordinate processing of the request so that both structure instances make the same update. Its Duplexed.

  1. After both agree(3,4), the response comes to XES. Now XES knows that request gone in boths the CFs and in sync in both the places.

  2. XES responds back to IRLM(Exploiter) and its passes to application.

SSM, CF Structure Duplexing

Note: Synchronous cannot continue processing until both instances of the lock structure are updated.

Asynchronous System Managed CF structure duplexing

This is from Db2 12,

  1. IRLM makes the request.

  2. XES takes that request and sends it to the primary structure

  3. Primary Structure(CF1) add a sequence number(OSN and LCOSNP) on the request and responds back.

  4. XES keeps track of primary and secondary OSNs. Its responds back to IRLM with Primary sequence number.

    • (ASYNC) : Sends commands sequenced structure object updates to secondary CF2 and it responds back indicating safe arrival/store and returns LCOSN
    • Secondary will execute the commands in the same order it was executed in the primary.
  5. IRLM responds back to application

Every often XES queries secondary to know the status by checking Secondary OSN.

ASM, CF Structure Duplexing

Note: Db2 will not write teh log output buffer until the corresponding lock information is updated in the secondary structure.

In a parallel sysplex data sharing environment there would be multiple z/OS systems even though each one finding out from Primary whats going on. The values of operation sequence number(LOSN and LCOSNP) will be different across z/OS systems. It keeps track by a Secondary Update Recovery Table(SURT). Failover for async duplex might be a little slower than sync duplex(both CF are identical).

Note: program can continue processing while the secondary instance of the lock structure is updated.


2AR - ISPF Master Class - Edit from A to ZLAST

Speaker: Dan Dirkse
Source : Slides


6AT - zOS Terminology

Speaker: Keith Sisson

  • System Z - Z stands for Zero down time. It can be refered to as

    • CEC - Central Electronics Complex. This is the New fashioned term
    • CPC - Central Processor Complex
  • MIPS - Millions of Instructions Per Second, its the measure of the capcity of the machine. Higher the MIPS, more it will cost to operate the machine.

  • MSU - Millions of Service Units, its the measure of the theoretical capacity of the machine to operate.

z13

  • Processors are configured to operate differently.

    • General Purpose Processor(GP) : All instructions for the machine are executed.
    • zIIP : This mostly runs JAVA Workload. It does not count towards MSUs towards the machine. You do not pay for the code that is executed in zIIP engine. Its to encourage to use JAVA in mainframe and reduce cost.
    • Internal Coupling Facility(ICF) :
    • IFL : This engine is designed to run zLinux workloads. Similar to zIIP.
  • Channels/Cards

    • Open System Adapter(OSA) : Its a Network card for the mainframe, prior to TCPIP being widely available, mainframes had their propritery own networking protocol(SNA). Modern MF is connected via TCP/IP just like anyother devices.
    • Other cards are Crypto and Ficon

Processors

There are two types of processors that are part of mainframe,

  • Processor Unit(PU) Chip : All GP, zIPP, IFL and ICF all types of engines on PU chip.
  • System Controller(SC) Chip : PU chips are controlled by SC chips in the drawer. SC chip co-ordinates activities between the PU chips and it also controls communication with other processor drawers in the path.

CPC Drawer

Newest Chip from IBM is Telum Chip. Its a 7nm chip, it refers to the size of the transistor on the chip. There's 22.5 Billion transistors on the chip, 8 cores, 5Ghz+ Clocks. Thickness of human hair is 80,000nm-100,000nm

Direct Access Storage Device(DASD)

  • Each storage device has a six character volser(volume serial)
Virtual Tape
  • Originally a tape was used to store data where batch job would be submitted to copy data from disk to tapes.
  • Now, we are using virtual tapes which are actually disks behind the box. Luminex Virtual Tape Controller emulates a tape drive and write to disk.

All the tapes and DASD are connected via FICON cables

  • Supports hundereds of concurrent 16Gbps fiber optic IO Connections
  • I/O link called 'IBM zHyperLink Express' that is designed to support up to 32 ports with link data rate of up to 8 GigaBytes per second.

SysPlex

  • Coupling facility is another standalone processor, it could be interal(ICF) or exteral as in below pic.
  • Connecting two CEC with coupling facility makes this a Parallel Sysplex
  • Coupling facility can be used to group two lpars in one machine or separate physical machines. Sysplex is just grouping of logical partitions.

Parallel Sysplex

PR/SM is a hardware hypervisor inside of the mainframe. This is a firmware code that runs on mainframe allows logically partition the device. It can be thought of VMWare.

Hardware Hypervisor

Address spaces

Address spaces are inside an z/OS LPAR. There are different types of address spaces like,

  • Address spaces used by system are called System Tasks(MASTER, CATALOG, CONSOLE, ALLOCAS)
  • There are address spaces started and controlled by programmers called Started Tasks like VLF, LLA, JES2, TCP/IP, CICS
  • Each person logs on to mainframe get their own address spaces.
  • Batch jobs

Address Space

All address spaces are started identical, created by OS and programs are loaded into them and executed.

An address space can be think of as a giant excel spreadsheet and its 16Exabytes thats 1Million terabyte.

When its first created, it looks like an empty spreadsheet then OS loads few things for you below 16MB LINE as thats what original MVS can talk to as it had 24-bit address.

Address Space Internals

CICS is a subsystem runs in a address space but refered to as a CICS Region. Its sort of a web server.

CICS Plex

Here two lpars are connected via Couple Facility and bunch of CICS Regions running on different lpars for redundancy. CICS Regions on same lpar are grouped and called as CICS System Group and entirely called as CICS Plex.

Db2 Datasharing similar to CICS Db2 Datasharing

  • APAR is used to describe a problem
  • PTF provides the resolution

4AU - Db2 zOS housekeeping best practices

Speaker: Michal Bialecki

Why Housekeeping is important ?

  • With SELECT, searched UPDATE - data in order is processed more efficiently ( less GETPAGES, less SYNC IO ) than disordered data, with indirect references to overflow records or pseudo deleted rows/mass deleted pages
  • With INSERT/UPDATE, search algorithm for empty row slots find empty spots quicker
  • Mitigate concurrency issues - for hot space map pages, deadlocks/timeouts
  • Saves DASD Space ( after REORG )
    • Getting rid of empty spaces, pseudo deleted and overflow (indirect) records, releases space in tables/indexes and VSAM EXTENTS

With any DML(IUD) processing disorganizes data in tables,

  • UPDATE data in place with longer value, that does not fit into original record in data page(overflows)
  • DELETE leaves pseudo deleted rows(indexes)
  • MASS DELETES - pseudo deleted pages / spacemap
  • INSERTS out of order due to locked pages
  • INSERTS to end of table partition (eg with MEMBER CLUSTER)
  • INSERTS into ALTERed datatypes columns(eg CHAR -> VARCHAR)

Overview - REORG / RUNSTATS / REBIND

  • REORG TABLESPACE reorders tablespace/partition according to clustering, reclaims fragmented space, materializers pending definition changes

  • REORG INDEX reorganizes an index space to reclaim fragmented space

  • RUNSTATS gathers metadata info and info about data characteristics/frequency/cardinality/distribution in tablespace, indexes and partitions. Statistics info quality may degrade as application performs IUD which affects AP choices for

    • For dynamic statements during PREPARE / RE-PREPARE
    • For static packages being under process of BIND/REBIND/AUTOBIND
  • BIND/REBIND prepares and preserves access path runtime and stores in Db2 catalog/directory, from where it can be loaded into Db2 and (re)executed.

How we may know what statistics to collect duirng RUNSTATS

  • With STATSINT zPARM interval in-memory objects statistics are externalized to catalog tables

    • SYSIBM.SYSTABLESPACESTATS
    • SYSIBM.SYSINDEXSPACESTATS
  • With every BIND/REBIND/PREPARE there is a SYSIBM.SYSSTATFEEDBACK catalog table updated to indicate missing/conflicting statistics.

  • SYSSTATFEEDBACK table has a column REASON with following values and their respective description and statements for RUNSTATS in Db2 12 are generated in SYSIBM.SYSTABLES_PROFILES.PROFILE_TEXT

ValueDescription
BASICBasic statistic is missing
KEYCARDKEYCARD statistic is missing
LOWCARDColumn has low cardinality which indicates data skew is likely
NULLABLEDistribution statistics not available for a nullable column
IXPROBEIndex probing was used to improve very low filter factor
DEFAULTPredicate references a value that is probably a default value
RANGEPRDHistogram statistics not available for a range predicate
PARALLELParallism could be improved by uniform partitioning of key ranges
CONFLICTAnother statistic conflicts with this statistic
COMPFFIXMulti-column cardinality statistics are needed for an index composed filter factor

sysibm.systables_profiles

Does this solve all the AP problems, NO. Think of this as a suggestion.

Use DSNACCOX formulas to get candidate objects for REORG/RUNSTATS

  • It also includes detection of RESTRICTED TS/IX states to be put in REORG object list of REORG.

When data is deleted, in the index its pseudo deleted but data still exists there, so for any other queries, that data will be read and based on the flag it will know its invalid and it will skip it. Point is, there is an IO for this.

Index page splits, when leaf page is full, Db2 will create a new page and move some(50%) of the entries in the old page to new page, this is to balance index tree. This technique doesn't work well with sequential inserts as half of the space in old index page is not used.

Pseudo-deleted index entries

  • zParm INDEX_CLEANUP_THREADS=10 means there will be 10 system tasks to do clean up automatically, it cleans up pages that are touched by GETPAGE processing and will be limited to quite periods. This reduces the need for REORG INDEX. IFCID 377 tracks cleanup at the index page level.

RUNSTATS is not recommended outside of REORG, use it only with conjuction with REORG table

Review the list of object candidates

Using DSNACCOX formulas build a scoring table
  • Evaluate each query predicate, how many rows it generates and choose top ones
  • Any deferred tablespace and index REORG needs to be prioritized for the next window(or consider increasing window duration)
Perform trend analysis on triggers for REORG and investigate object tuning
  • Increase PCTFREE to reduce index page splits ?
  • Define some PCTFREE FOR UPDATE ?
    • Can use AUTO(-1) which used RTS to adjust to optimum, but bear in mind that it will take several REORG cycles to learn into it
  • Increase SEGSIZE to provide better chance of maintaining clustering ?
  • Consider larger index page size to reduce index page splits ?
  • You may enchance the automation logic to escalate to a tablespace REORG if a significance number of partitions become a candidate for REORG.

RUN ONLY - REORG IX without STATISTICS. Otherwise it may create time drift between AP stats for table and index

  • Do not run RUNSTATS without REORG TABLESPACE
  • Do not run RUNSTATS on index, without RUNSTATS on table or on some partition only.
REORG Recommendations
  • Avoid transaction timeouts during REORG
    • (MAXRO + DRAIN_WAIT) < (IRLMRWT - 5 seconds)
  • For large objects, or when troubles to fit into maintenance windows:
    • Run REORG with MAXRO DEFER, schedule the REORG outside the REORG window, and ALTER UTILITY REORG MAXRO n inside REORG window
    • Can run REORG with FORCE ALL option to try and force the REORG through, but at the risk of transaction failures.
  • Use DRAIN ALL - don't use DRAIN WRITERS - you would need to DRAIN twice(once for writers and once for readers), WRITERS and ALL(one time itself it will drain WRITERS and READERS)
  • Consider using option DRAIN_ALLPARTS YES
    • Helps avoid undetected deadlocks between CLAIMS and DRAINS when running REORG at the PART level when one or more NPIs on table
  • zPARM REORG_DROP_PBG_PARTS shall be set to DISABLE, otherwise, if partition is dropped you cant recover to PIT prior to REORG
RUNSTATS Recommendations
  • Don't use FREQVAL by default

    • Do not specify it in your RUNSTATS job unless it was recommended by SYSSTATFEEDBACK or Data Studio
    • If you exclude FREQVAL - we will by default collect on the leading column of each index
  • RUNSTATS INDEX ... FREQVAL NUMCOLS n - n>1 are often not useful

    • This statistics only cover frequent values of combination of ALL together columns listed.
    • Use ONLY for queries with EQUAL predicates with LITERALS
    • Dont use FREQVAL (BOTH or LEAST) use only MOST its valuable to the optimizer to know the highly skewed values.
  • Don't use HISTOGRAMS by default, they can quickly become stale for ascending columns(especially dates/timestamps)

  • From V12, INVALIDATECACHE NO is the default -- Running RUNSTATS will not purge Cached Dynamic Statements(DSC). If you want to invalidate DSC use

    • INVALIDATECACHE YES
    • REPORT NO UPDATE NONE
    • RESET ACCESS PATH
  • Runstats may not be necessary for VOLATILE tables as when Db2 executes queries against volatile tables, Db2 chooses index access whenever possible.

On REBINDs

Db2 provides many features to reduce the introduction of new access paths and providing stability and reliability in performance

  • APREUSE - To gain the benefit of new runtime structures - with reduced risk of regression
  • APCOMPARE - compare access path wehn you rebind, and either issue WARNing(WARN option) or terminate(ERROR option) if differences detected.
  • REBIND EXPLAIN(ONLY) ... or EXPLAIN PACKAGE -- check current access path
  • SWITCH - To revert to prior "good" performance. You should have original

New REBIND Behaviour from Db2 12 FL505

  • To allow BIND/REBIND to complete without requiring quiesce(exclusive access) of package. Will create a new CURRENT(phase-in) and existing will be the "phase-out" copy

Identifying Stale Statistics


Thats all for now. Thanks for reading