
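Key points of this article

The four main attributes for inspecting a NumPy array's structure: shape / ndim / size / dtype
shape is a tuple of the length along each dimension, ndim is the number of dimensions, size is the total number of elements, and dtype is the element type
Checking memory size: itemsize (bytes per element) × size = nbytes
Changing shape: reshape / flatten / ravel / transpose
Memory layout: C order (row-major) is the default; order="F" gives Fortran order (column-major)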

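Main attributes of NumPy arrays

The NumPy ndarray is the foundation for vectorized computation and multidimensional data. When broadcasting or a matrix operation trips you up, the first rule is to check the shape.

Attribute	Returns	Purpose
`arr.shape`	tuple	Length along each dimension
`arr.ndim`	int	Number of dimensions
`arr.size`	int	Total number of elements
`arr.dtype`	dtype	Element type (int32, float64, ...)
`arr.itemsize`	int	Bytes per element
`arr.nbytes`	int	Bytes used by all elements of the array
`arr.flags`	flags object	Memory layout information (C/F contiguity, writeability, etc.)
`arr.strides`	tuple	Bytes to step in memory to advance by 1 along each axis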

Concrete example

import numpy as np

# 2-D array (3 rows, 4 columns)
arr = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

arr.shape        # (3, 4)
arr.ndim         # 2
arr.size         # 12
arr.dtype        # dtype('int64')  (int32 on Windows with NumPy < 2.0)
arr.itemsize     # 8 (int64 = 8 bytes)
arr.nbytes       # 96 (8 * 12)

# 4-D array (a batch of images: N, H, W, C)
imgs = np.zeros((32, 224, 224, 3), dtype=np.float32)
imgs.shape       # (32, 224, 224, 3)
imgs.ndim        # 4
imgs.size        # 32*224*224*3 = 4_816_896
imgs.nbytes      # 19_267_584 (about 19 MB)
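
Because nbytes = itemsize × size, the memory an array will need can be estimated from its shape and dtype before allocating anything. The following is a minimal sketch of that arithmetic; the helper name `estimate_nbytes` is hypothetical and not part of NumPy.

```python
import math

import numpy as np


def estimate_nbytes(shape, dtype):
    """Bytes an array of this shape and dtype would occupy (hypothetical helper)."""
    return math.prod(shape) * np.dtype(dtype).itemsize


batch_shape = (32, 224, 224, 3)
estimated = estimate_nbytes(batch_shape, np.float32)

# Cross-check the estimate against a real allocation.
imgs = np.zeros(batch_shape, dtype=np.float32)
assert imgs.nbytes == imgs.itemsize * imgs.size == estimated

print(f"{estimated / 1e6:.1f} MB")   # 19.3 MB
```

This kind of check is useful before creating large batches, since the estimate costs nothing while the allocation may not fit in memory.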

Inspecting everything at once with np.info() / repr()

np.info(arr)
# class:  ndarray
# shape:  (3, 4)
# strides:  (32, 8)
# itemsize:  8
# aligned:  True
# contiguous:  True
# fortran:  False
# data pointer: 0x...
# byteorder:  little
# byteswap:  False
# type: int64

print(repr(arr))
# array([[ 1,  2,  3,  4],
#        [ 5,  6,  7,  8],
#        [ 9, 10, 11, 12]])

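dtype reference

dtype	Size	Range / description
`int8` / `uint8`	1 byte	-128 to 127 / 0 to 255 (uint8 is common for images)
`int16` / `uint16`	2 bytes	Audio PCM samples, etc.
`int32` / `uint32`	4 bytes	Default integer on Windows with NumPy < 2.0
`int64` / `uint64`	8 bytes	Default integer on 64-bit Linux/macOS (and on 64-bit Windows since NumPy 2.0)
`float16`	2 bytes	Half precision (GPU inference)
`float32`	4 bytes	Single precision (the deep learning standard)
`float64`	8 bytes	Double precision (equivalent to Python's float)
`complex64/128`	8/16 bytes	Complex numbers
`bool_`	1 byte	True/False
`object_`	pointer-sized	References to Python objects (slow, so avoid when possible)
`<U10`	40 bytes	Unicode string (up to 10 characters, 4 bytes each)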

# Specify the dtype explicitly
arr = np.array([1, 2, 3], dtype=np.float32)
arr.dtype                 # dtype('float32')

# Convert the type (astype returns a new array)
arr2 = arr.astype(np.int8)

# Common pitfall: a single float promotes the whole array to float
arr = np.array([1, 2, 3, 4.5])
arr.dtype                 # dtype('float64')   ← not an integer type!
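
Type conversion with astype has two quiet failure modes worth seeing once: float-to-int conversion truncates toward zero, and narrowing an integer dtype wraps out-of-range values without raising. The sketch below illustrates both and shows one way to check the range first with np.iinfo; the variable names are only for illustration.

```python
import numpy as np

# float -> int truncates toward zero (it does not round).
f = np.array([1.7, -1.7, 2.5])
as_int = f.astype(np.int64)          # array([ 1, -1,  2])

# Narrowing an integer dtype wraps silently when a value is out of range.
big = np.array([100, 200, 300], dtype=np.int64)
wrapped = big.astype(np.int8)        # array([100, -56,  44], dtype=int8)

# Check the representable range first with np.iinfo.
info = np.iinfo(np.int8)             # info.min == -128, info.max == 127
fits = bool(big.min() >= info.min and big.max() <= info.max)   # False

# Narrowing pays off when the values do fit: 8 bytes -> 1 byte per element.
small = np.array([1, 2, 3], dtype=np.int64)
assert small.astype(np.int8).nbytes * 8 == small.nbytes
```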

Changing shape

a = np.arange(12)
a.shape              # (12,)

# reshape (returns a view when possible, otherwise a copy)
b = a.reshape(3, 4)
b.shape              # (3, 4)

# -1 infers the length automatically (allowed for one dimension only)
c = a.reshape(2, -1)         # (2, 6)
d = a.reshape(-1, 4)         # (3, 4)

# Flattening
b.flatten()          # always a copy (new array)
b.ravel()            # a view when possible (shares memory), otherwise a copy

# Swapping axes
img = np.zeros((224, 224, 3))   # H, W, C
img_chw = img.transpose(2, 0, 1)  # C, H, W (PyTorch layout)
img.shape, img_chw.shape          # (224, 224, 3), (3, 224, 224)

# Adding a dimension
v = np.array([1, 2, 3])           # shape (3,)
v[:, None].shape                  # (3, 1)
v[None, :].shape                  # (1, 3)
np.expand_dims(v, axis=0).shape   # (1, 3)
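
Whether a shape operation returned a view or a copy matters as soon as you write to the result. A minimal way to check, assuming only NumPy's np.shares_memory, is sketched below.

```python
import numpy as np

a = np.arange(12)
b = a.reshape(3, 4)

view_reshape = np.shares_memory(a, b)              # True: reshaping a contiguous array gives a view
copy_flatten = np.shares_memory(b, b.flatten())    # False: flatten always copies
view_ravel = np.shares_memory(b, b.ravel())        # True: b is C-contiguous, so ravel is a view
copy_ravel_t = np.shares_memory(b, b.T.ravel())    # False: b.T is not C-contiguous, so ravel copies

# Writing through a view changes the original array.
b[0, 0] = 99
assert a[0] == 99
```

If an independent array is required regardless of layout, calling .copy() explicitly avoids depending on these rules.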

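Check shape before relying on broadcasting

NumPy's broadcasting compares dimensions starting from the trailing one. Two dimensions are compatible when they are equal or when one of them is 1, and missing leading dimensions are treated as 1. To avoid surprises, check shape every time.

A = np.ones((3, 4))      # shape (3, 4)
B = np.ones(4)           # shape (4,)   → trailing dimension matches; treated as (1, 4) and stretched
C = A + B                # shape (3, 4)

D = np.ones((3, 1))      # shape (3, 1)
E = A + D                # shape (3, 4)   ← D is stretched along axis 1 (across columns)

F = np.ones((3,))        # shape (3,)
# A + F  → ❌ ValueError: operands could not be broadcast together with shapes (3,4) (3,)
# The trailing dimension of A.shape is 4 but F's is 3, so they do not match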
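
The compatibility rule can be tested without building any arrays. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes is available, and also shows one common fix for the failing (3, 4) + (3,) case.

```python
import numpy as np

# np.broadcast_shapes applies the broadcasting rule to shapes only.
ok = np.broadcast_shapes((3, 4), (4,))        # (3, 4)
ok2 = np.broadcast_shapes((3, 4), (3, 1))     # (3, 4)

try:
    np.broadcast_shapes((3, 4), (3,))
    compatible = True
except ValueError:
    compatible = False                         # trailing 4 vs 3: not compatible

# Fix for the failing case: give F an explicit trailing axis of length 1.
A = np.ones((3, 4))
F = np.ones((3,))
G = A + F[:, None]                             # (3, 4) + (3, 1) -> (3, 4)
```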

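Memory layout (C order / F order)

NumPy arrays are C-ordered (row-major) by default. Code ported from Fortran or MATLAB may expect F order (column-major).

arr_c = np.arange(12).reshape(3, 4)         # C order by default
arr_f = np.arange(12).reshape(3, 4, order='F')

arr_c.flags['C_CONTIGUOUS']   # True
arr_c.flags['F_CONTIGUOUS']   # False

# Strides below assume int64 elements (8 bytes each)
arr_c.strides                 # (32, 8)    32 bytes to the next row, 8 bytes to the next column
arr_f.strides                 # (8, 24)    8 bytes to the next row, 24 bytes to the next column

# transpose returns a view (the data does not move)
arr_t = arr_c.T
arr_t.flags['C_CONTIGUOUS']   # False
arr_t.flags['F_CONTIGUOUS']   # True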
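
The stride values follow directly from the shape and the item size: in C order the last axis steps by one element, and each earlier axis steps by the product of all later lengths. The sketch below recomputes them with a hypothetical helper, `c_order_strides` (not a NumPy function), and shows np.ascontiguousarray as one way to turn a transposed view back into a C-contiguous array.

```python
import numpy as np


def c_order_strides(shape, itemsize):
    """Byte strides of a C-contiguous array (hypothetical helper for illustration)."""
    strides = []
    step = itemsize
    for length in reversed(shape):
        strides.append(step)
        step *= length
    return tuple(reversed(strides))


# dtype is fixed so that itemsize is 8 on every platform.
arr_c = np.arange(12, dtype=np.int64).reshape(3, 4)
assert arr_c.strides == c_order_strides(arr_c.shape, arr_c.itemsize) == (32, 8)

# The transpose is a view with swapped strides; copy it to get a C-contiguous array.
arr_t = arr_c.T
arr_t_c = np.ascontiguousarray(arr_t)
print(arr_t.strides, arr_t_c.strides)   # (8, 32) (24, 8)
```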

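Converting to and from list / pandas

# NumPy → Python list
arr = np.arange(1, 13).reshape(3, 4)   # the 3x4 array from the first example
arr.tolist()
# [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# To a pandas DataFrame
import pandas as pd
df = pd.DataFrame(arr, columns=['a', 'b', 'c', 'd'])
df.shape                  # (3, 4)
df.dtypes                 # int64 for every column (int32 on Windows with NumPy < 2.0)

# DataFrame → NumPy
df.to_numpy()             # → ndarray
df.values                 # older API; the pandas docs recommend to_numpy() instead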

Debugging tips

import numpy as np

# A typical flow for investigating shape bugs
def debug_shape(name, arr):
    print(f"{name}: shape={arr.shape} ndim={arr.ndim} "
          f"dtype={arr.dtype} size={arr.size} nbytes={arr.nbytes}")

# Illustrative arrays; substitute your own data
X = np.zeros((100, 8))
W = np.zeros((8, 3))
y = np.zeros(100)

debug_shape('X', X)
debug_shape('W', W)
debug_shape('y', y)

# Guarantee shapes with assert
def matmul(A, B):
    assert A.ndim == 2 and B.ndim == 2, f"need 2D, got {A.ndim}D and {B.ndim}D"
    assert A.shape[1] == B.shape[0], f"shape mismatch: {A.shape} @ {B.shape}"
    return A @ B
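
Note that assert statements are removed when Python runs with the -O flag, so shape checks that must always run are better written as explicit exceptions. The following is one possible sketch; `expect_shape` is a hypothetical helper, not a NumPy API, and it treats None in the pattern as "any length".

```python
import numpy as np


def expect_shape(name, arr, pattern):
    """Raise ValueError unless arr.shape matches pattern; None matches any length.

    Hypothetical helper for illustration, not part of NumPy.
    """
    ok = arr.ndim == len(pattern) and all(
        want is None or want == got for want, got in zip(pattern, arr.shape)
    )
    if not ok:
        raise ValueError(f"{name}: expected shape {pattern}, got {arr.shape}")
    return arr


X = np.zeros((100, 8))
expect_shape("X", X, (None, 8))        # passes: any batch size, 8 features

try:
    expect_shape("X", X, (None, 3))
except ValueError as exc:
    message = str(exc)                 # "X: expected shape (None, 3), got (100, 8)"
```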

FAQ

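Q: What is the difference between shape and len()?
A: len(arr) returns only the length of the first dimension (= shape[0]). For multidimensional arrays, always use shape.

Q: I want to reduce memory usage
A: Narrow the dtype to fit the use case: uint8 for images, bool for flags. For deep learning, float32 or float16 is often sufficient.

Q: What is the difference between shape (N,) and (N, 1)?
A: The former is a 1-D vector; the latter is a 2-D column vector. They behave differently in matrix operations, so check how broadcasting applies when you use them.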
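
The answers above can be checked directly. The sketch below uses small illustrative arrays to show how (N,) and (N, 1) diverge under transpose, element-wise addition, and matrix multiplication, and that len() reports only the first axis.

```python
import numpy as np

v = np.arange(3)            # shape (3,)
col = v[:, None]            # shape (3, 1)

# Transposing a 1-D array is a no-op; a column vector becomes a row vector.
assert v.T.shape == (3,)
assert col.T.shape == (1, 3)

# Element-wise addition: (3,) + (3,) stays 1-D, but (3, 1) + (3,) broadcasts to a 3x3 grid.
assert (v + v).shape == (3,)
outer_sum = col + v
assert outer_sum.shape == (3, 3)

# Matrix product: 1-D @ 1-D is a scalar inner product; (1, 3) @ (3, 1) is a 1x1 matrix.
inner = v @ v               # 0*0 + 1*1 + 2*2 = 5
assert (col.T @ col).shape == (1, 1)

# len() only reports the first axis.
assert len(col) == col.shape[0] == 3
```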


