Trying to dive into ATtiny assembly

   or PORTB, R16 ; HI -> (clk, dio)
      :
   and PORTB, R16 ; LOW -> (clk, dio)

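You can't do that on an AVR: OR and AND only operate on the general-purpose registers. The instructions that take a PORT as an argument are very limited: IN/OUT, SBI/CBI, SBIS/SBIC...
For the LOW case you probably want this (and the same thing with sbi for HI):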

  cbi PORTB, CLK  ; clock low
  cbi PORTB, DIO  ; data low

At two instructions, 4 clocks, and no registers used, that's the shortest code you'll get. It's probably also what the compiler produces.
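
If you'd rather keep the mask style of your original code, the port value has to pass through a register first. A rough sketch, assuming Atmel-assembler-style definitions where PORTB is the I/O address and CLK and DIO are bit numbers (as in the cbi lines above), with r17 picked arbitrarily as a scratch register:

```asm
    ; HI -> (clk, dio): read the port, set both bits, write it back
    in   r17, PORTB                 ; copy the current port value into r17
    sbr  r17, (1<<CLK)|(1<<DIO)     ; set both bits in the copy (same as ori)
    out  PORTB, r17                 ; write the copy back to the port

    ; LOW -> (clk, dio): same pattern, clearing the bits instead
    in   r17, PORTB
    cbr  r17, (1<<CLK)|(1<<DIO)     ; clear both bits (andi with the inverted mask)
    out  PORTB, r17
```

That's three single-cycle instructions per edge and it ties up a register, but both pins change in the same write instead of one after the other.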

   rcall TM1637_DELAY_US + 1

What are you expecting that "+ 1" to do?
Have you looked at the object code produced by the compiler? (If you're using avr-gcc, running avr-objdump -d on the .elf file will show it.) That's frequently a good idea in cases like this:

  • you can focus your attention on the areas that turn out to be "big" in the generated code
  • you get some concrete examples of what sorts of instructions are available.