lspci tips on Red Hat Enterprise Linux (RHEL) 7.9 with an NVIDIA P100 GPU card.

#lspci | grep -i --color 'vga\|3d\|2d'

 

#sudo lshw -class display

#glxinfo | more

# glxinfo | egrep -i 'device|memory'

# sudo lshw -c display

# glxinfo -B

Install P100 driver on RHEL 7.9

How to disable nouveau driver:

Edit /etc/default/grub and append the following to the GRUB_CMDLINE_LINUX line:
modprobe.blacklist=nouveau
Save and close the file. Rebuild the GRUB configuration with the command that matches your firmware (BIOS or UEFI), then restart the system.

On a BIOS system, run:

$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg

On a UEFI system, run:

$ sudo grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
Reboot the Linux box now:
$ sudo reboot
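
The edit to the GRUB_CMDLINE_LINUX line can also be scripted. The following is only a sketch: the function name add_nouveau_blacklist is made up for this post, it assumes GNU sed, and it assumes the line has the usual GRUB_CMDLINE_LINUX="..." form with double quotes. It keeps a .bak copy of the file and does nothing if the argument is already present.

```shell
# Hypothetical helper (not part of any package): add
# modprobe.blacklist=nouveau to the GRUB_CMDLINE_LINUX line of the given
# file (default: /etc/default/grub) unless it is already there.
add_nouveau_blacklist() {
    grub_file="${1:-/etc/default/grub}"
    # Nothing to do if the argument is already on the line.
    if grep -q '^GRUB_CMDLINE_LINUX=.*modprobe\.blacklist=nouveau' "$grub_file"; then
        return 0
    fi
    # Keep a backup, then insert the argument before the closing quote.
    cp "$grub_file" "$grub_file.bak"
    sed -i 's/^\(GRUB_CMDLINE_LINUX="[^"]*\)"/\1 modprobe.blacklist=nouveau"/' "$grub_file"
}
```

Run as root, add_nouveau_blacklist with no argument edits /etc/default/grub. The grub2-mkconfig step and the reboot shown above are still required afterwards.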

To stop the graphical session (switch to runlevel 3):

# sudo init 3

To resume the graphical session (switch back to runlevel 5):

# sudo init 5

By default, the lspci command lists every PCI device in the system:

# lspci

In the lspci output, the NVIDIA P100 PCIe card appears at bus number '25', device number '00' and function number '0', which lspci writes as the address 25:00.0.
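
An address such as 25:00.0, used with lspci -s throughout this post, has the form bus:device.function. As a small illustration, the sketch below splits such an address into its three parts. The function name parse_bdf is hypothetical, and it assumes the short form without a leading PCI domain (that is, not 0000:25:00.0).

```shell
# Hypothetical helper: split a PCI address "bus:device.function" into parts.
# Assumes there is no leading domain such as "0000:".
parse_bdf() {
    bdf="$1"
    bus="${bdf%%:*}"     # text before the first ':'
    rest="${bdf#*:}"     # text after the first ':'
    dev="${rest%%.*}"    # text before the '.'
    fn="${rest#*.}"      # text after the '.'
    echo "bus=$bus device=$dev function=$fn"
}
```

For example, parse_bdf 25:00.0 prints bus=25 device=00 function=0.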

# lspci -tv | grep "NVIDIA"

 

#lspci | grep "NVIDIA"

#lspci -s 25:00.0 -vvv | grep Speed

#lspci -s 25:00.0 -vvv | grep \\[

Checking PCIe Max Payload Size (MPS)

#lspci -s 25:00.0 -vvv | grep DevCtl: -C 2
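
MaxPayload appears twice in the lspci output: once under DevCap (what the device can support) and once under DevCtl (what is currently configured). The grep above shows the DevCtl block. If only the number is needed, a sketch along the following lines could extract it. The function name devctl_max_payload is made up for this post, and it assumes the lspci -vvv layout shown further down, where the DevCtl block ends at the DevSta line.

```shell
# Hypothetical helper: read `lspci -vvv` output on stdin and print the
# MaxPayload value (in bytes) from the DevCtl block, ignoring DevCap.
devctl_max_payload() {
    awk '
        /DevCtl:/ { in_ctl = 1 }   # start of the DevCtl block
        /DevSta:/ { in_ctl = 0 }   # DevSta follows DevCtl and ends the block
        in_ctl && match($0, /MaxPayload [0-9]+ bytes/) {
            split(substr($0, RSTART, RLENGTH), parts, " ")
            print parts[2]
            exit
        }
    '
}
```

Usage would be lspci -s 25:00.0 -vvv | devctl_max_payload. Reading the capability blocks usually requires root; without it lspci may not show them.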

 

PCI, PCI-X and PCIe devices expose Base Address Registers (BARs) in their PCI configuration space. While booting, the Linux kernel scans the PCI bus and finds every device on it, including PCI-to-PCI bridges. For each device, the kernel checks how many BARs the configuration space implements, then determines how much address space each BAR needs, and of which type, by writing 0xFFFFFFFF to the BAR register and reading the value back. The kernel then allocates address ranges to the devices accordingly. These ranges show up as the Region lines in the lspci output below.
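
To make the sizing step concrete: after 0xFFFFFFFF is written, the device returns zeros in the address bits it does not implement, so the size falls out of the value read back. The sketch below does that arithmetic for a single 32-bit memory BAR. The function name bar_size_from_readback is hypothetical, and the sketch does not cover I/O BARs or 64-bit BARs, which combine two consecutive registers.

```shell
# Hypothetical helper: given the value read back from a 32-bit memory BAR
# after writing 0xFFFFFFFF to it, print the size in bytes the BAR requests.
bar_size_from_readback() {
    readback=$(( $1 ))
    # Bits 3:0 of a memory BAR are type/prefetch flags, not address bits.
    addr_bits=$(( readback & 0xFFFFFFF0 ))
    # Invert within 32 bits and add one to get the size.
    echo $(( ((~addr_bits) & 0xFFFFFFFF) + 1 ))
}
```

For example, a read-back value of 0xFF000000 gives 16777216 bytes, that is 16M, which is the size lspci reports for Region 0 of the P100 below.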

#lspci -s 25:00.0 -vvv

25:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)

        Subsystem: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB]

        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+

        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

        Latency: 0, Cache Line Size: 32 bytes

        Interrupt: pin A routed to IRQ 153

        NUMA node: 0

        Region 0: Memory at a3000000 (32-bit, non-prefetchable) [size=16M]

        Region 1: Memory at 38b800000000 (64-bit, prefetchable) [size=16G]

        Region 3: Memory at 38bc00000000 (64-bit, prefetchable) [size=32M]

        Capabilities: [60] Power Management version 3

                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+

                Address: 00000000fee00078  Data: 0000

        Capabilities: [78] Express (v2) Endpoint, MSI 00

                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us

                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W

                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-

                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+

                        MaxPayload 256 bytes, MaxReadReq 512 bytes

                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-

                LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us

                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+

                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+

                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

                LnkSta: Speed 8GT/s (ok), Width x16 (ok)

                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

                DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+

                         10BitTagComp- 10BitTagReq- OBFF Via message, ExtFmt- EETLPPrefix-

                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-

                         FRS- TPHComp- ExtTPHComp-

                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-

                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,

                         AtomicOpsCtl: ReqEn-

                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-

                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-

                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

                         Compliance De-emphasis: -6dB

                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+

                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-

                         Retimer- 2Retimers- CrosslinkRes: unsupported

        Capabilities: [100 v1] Virtual Channel

                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1

                Arb:    Fixed- WRR32- WRR64- WRR128-

                Ctrl:   ArbSelect=Fixed

                Status: InProgress-

                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-

                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-

                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01

                        Status: NegoPending- InProgress-

        Capabilities: [250 v1] Latency Tolerance Reporting

                Max snoop latency: 0ns

                Max no snoop latency: 0ns

        Capabilities: [258 v1] L1 PM Substates

                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+

                          PortCommonModeRestoreTime=255us PortTPowerOnTime=10us

                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-

                           T_CommonMode=0us LTR1.2_Threshold=0ns

                L1SubCtl2: T_PwrOn=10us

        Capabilities: [128 v1] Power Budgeting <?>

        Capabilities: [420 v2] Advanced Error Reporting

                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-

                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-

                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+

                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+

                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-

                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-

                HeaderLog: 00000000 00000000 00000000 00000000

        Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>

        Capabilities: [900 v1] Secondary PCI Express

                LnkCtl3: LnkEquIntrruptEn- PerformEqu-

                LaneErrStat: 0

        Kernel driver in use: nouveau

        Kernel modules: nvidiafb, nouveau
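
In the output above, LnkCap gives the maximum link speed and width the card supports, and LnkSta gives what was actually negotiated (here both are 8GT/s and x16). As a rough sketch, the two lines could be compared automatically. The function name link_status and its output wording are made up for this post, and the parsing assumes the "Speed ..., Width ..." layout shown above.

```shell
# Hypothetical helper: read `lspci -vvv` output on stdin and report whether
# the negotiated link (LnkSta) matches the maximum the device supports (LnkCap).
link_status() {
    awk '
        # Return the token that follows "key " up to the next space or comma.
        function field(line, key) {
            if (!match(line, key " [^ ,]+")) return ""
            return substr(line, RSTART + length(key) + 1, RLENGTH - length(key) - 1)
        }
        /LnkCap:/ { cap_speed = field($0, "Speed"); cap_width = field($0, "Width") }
        /LnkSta:/ { sta_speed = field($0, "Speed"); sta_width = field($0, "Width") }
        END {
            if (cap_speed == "" || sta_speed == "") { print "unknown"; exit }
            if (cap_speed == sta_speed && cap_width == sta_width)
                print "full " sta_speed " " sta_width
            else
                print "degraded " sta_speed " " sta_width " (max " cap_speed " " cap_width ")"
        }
    '
}
```

Usage would be lspci -s 25:00.0 -vvv | link_status. A "degraded" result can also be a link that has dropped to a lower speed to save power while idle, so it is worth checking again under load.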
